Wikipedia: The Popular Encyclopedia
Wikipedia is the most popular encyclopedia in the world, with over 6.5 million articles. It is a go-to source for research and for answers to all kinds of questions, from pop culture trivia to the fine details of science. People rely on Wikipedia for knowledge, but how can anyone be sure the information is accurate?
Because Wikipedia is crowdsourced, its facts must be corroborated: quotations, controversial statements, and contentious material about living people must include a citation. Keeping up is hard given the 17,000 new articles added each month. Automated tools can already flag gibberish or statements that lack citations, but helping human editors determine whether a source truly supports a claim is a much more complex task.
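To make the simpler of those two tasks concrete, here is a minimal sketch of flagging statements that have no citation. It is an illustration only, not a tool Wikipedia or Meta AI is known to use. It assumes citations appear as bracketed numbers such as [1] directly after a sentence, and the function name find_uncited_sentences is hypothetical. The sentence splitter is deliberately naive and would stumble on decimals or abbreviations.

```python
import re
from typing import List

# Assumption: citations are bracketed numbers such as [1], placed right after the sentence.
CITATION = re.compile(r"\[\d+\]")
# Naive sentence matcher: text up to terminal punctuation, plus any trailing citation markers.
SENTENCE = re.compile(r"[^.!?]+[.!?]+(?:\s*\[\d+\])*")


def find_uncited_sentences(text: str) -> List[str]:
    """Return the sentences in `text` that carry no citation marker."""
    sentences = [match.group().strip() for match in SENTENCE.finditer(text)]
    return [sentence for sentence in sentences if not CITATION.search(sentence)]


# Made-up sample text, for illustration only.
sample = "The sky is blue.[1] Water is wet. Cats are mammals.[2][3]"
uncited = find_uncited_sentences(sample)
print(uncited)
```

A pattern check like this can only tell whether a marker is present. It says nothing about whether the cited source actually backs up the sentence, which is the harder problem described above.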
Developing AI Tools
At Meta AI, researchers have developed the first model capable of automatically scanning thousands of citations to check their accuracy. Doing so involves training a neural network to understand the relationship between complex text passages. To find candidate sources, the team built a dataset of 134 million webpages, which is significantly more intricate than previous datasets. The model calls attention to questionable citations so that human editors can review them.
Verifier in Action
The verifier is trained on complicated statements and full websites so that it can judge whether a source supports a claim. For instance, it found a better source for a claim in the Wikipedia article “2017 in Classical Music”. The original footnote linked to a press release from a different organization, but the evidence-ranking model found a blog post on Violinist.com that stated the same thing as the claim.
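The idea of evidence ranking can be sketched in a few lines. The example below is a rough stand-in, not Meta AI's method: the real system uses a trained neural network, while this sketch scores candidates with plain bag-of-words cosine similarity. The function names (tokenize, cosine_similarity, rank_sources) and the sample claim and passages are all hypothetical and made up for illustration.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sources(claim: str, candidates: List[str]) -> List[Tuple[float, str]]:
    """Sort candidate passages by lexical similarity to the claim, best first."""
    claim_vec = tokenize(claim)
    scored = [(cosine_similarity(claim_vec, tokenize(p)), p) for p in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up claim and passages, for illustration only.
claim = "The bridge opened to traffic in 1932."
candidates = [
    "The museum extended its opening hours last year.",
    "In 1932 the bridge opened to traffic.",
]
ranking = rank_sources(claim, candidates)
for score, passage in ranking:
    print(f"{score:.2f}  {passage}")
```

Word overlap is a weak proxy for support: a passage can share every word with a claim and still contradict it. That gap is why the verifier described here relies on a neural network trained to relate complex passages rather than on simple matching.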
Towards Smarter Systems
This project is helping to develop the building blocks of the next generation of citation tools. Meta AI is also training neural networks to learn language and pinpoint relevant source material. This type of AI can help volunteers scale their work and provide reliable information to billions of people around the world. Ultimately, the project will advance research into smarter AI systems that can reason about real-world knowledge and make sense of the vast information available on the web.
Technomancer is a science and tech enthusiast who enjoys writing about software, AI, and other tech topics.