Meta Developing a Neural Network to Turbocharge Wikipedia

Meta AI, the artificial intelligence unit of Meta Platforms, has developed a machine learning model that it says can simultaneously scan hundreds of thousands of Wikipedia citations to check their accuracy. While the Wikimedia Foundation, which runs Wikipedia, already uses bots, Meta’s proposal would be more extensive than anything currently deployed. Trained on a dataset of 4 million Wikipedia citations, the new Meta AI tool analyzes the linked references and verifies that they actually support the statements they are cited for. With more than 17,000 new Wikipedia articles added each month, this is no small feat.
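
Meta has not released the tool alongside this article, but the core idea (checking whether a cited source actually backs up a claim) can be roughly illustrated in a few lines of Python. The sketch below is not SIDE: it substitutes TF-IDF cosine similarity from scikit-learn for Meta’s learned verification model, and the function name, example texts, and scoring approach are all assumptions made for illustration.

```python
# Minimal sketch of claim-vs-source verification (NOT Meta's SIDE model).
# TF-IDF cosine similarity stands in for a learned relevance/verification model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def support_score(claim: str, source_passages: list[str]) -> float:
    """Return the best similarity between the claim and any passage from the
    cited source; higher means stronger apparent corroboration."""
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([claim] + source_passages)
    claim_vec, passage_vecs = matrix[0], matrix[1:]
    return float(cosine_similarity(claim_vec, passage_vecs).max())


claim = "Wikipedia was founded in 2001."
passages = [
    "The online encyclopedia project launched in January 2001.",
    "Volunteers edit articles on a wide range of topics.",
]
print(round(support_score(claim, passages), 3))  # higher score = better support
```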

Founded in 2001, Wikipedia now hosts more than 6.5 million articles in English (and more than 11 million total, according to Wikipedia.org). That compendium is curated by more than 100,000 volunteer editors, reports Digital Trends, which describes Wikipedia as “one of the largest-scale collaborative projects in human history.”

Constant “tweaks and modifications” result in the most popular Wikipedia articles being “edited thousands of times, reflecting the very latest research, insights, and up-to-the-minute information,” Digital Trends explains.

Fabio Petroni, lead research tech manager for Meta AI’s FAIR (Fundamental AI Research) team, says the group was “driven by curiosity” to see if it could improve on the process. “We wanted to see what was the limit of this technology,” he told Digital Trends.

In a blog post, Meta writes that its AI app “calls attention to questionable citations, allowing human editors to evaluate the cases most likely to be flawed without having to sift through thousands of properly cited statements.”
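
In practice, that triage amounts to ranking citations by a verifier’s score and surfacing the lowest-scoring ones first. Continuing the hypothetical sketch above, with a data layout and threshold invented purely for illustration (not taken from Meta’s system):

```python
# Hypothetical triage: rank citations worst-first so editors see the most
# questionable ones, instead of sifting through properly cited statements.
def flag_questionable(citations: list[dict], threshold: float = 0.2) -> list[dict]:
    """citations: [{"claim": str, "passages": [str, ...]}, ...]"""
    scored = [
        {**c, "score": support_score(c["claim"], c["passages"])} for c in citations
    ]
    flagged = [c for c in scored if c["score"] < threshold]
    return sorted(flagged, key=lambda c: c["score"])  # lowest support first
```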

Eventually, the “goal is to build a platform to help Wikipedia editors systematically spot citation issues and quickly fix the citation or correct the content of the corresponding article at scale.” Meta notes that it is not in partnership with Wikimedia on the project, which it describes as “still in the research phase” and not operationally deployed.

Petroni’s group also wrote a technical paper that refers to the neural network-based system as SIDE.

Digital Trends describes “Google’s trillion-dollar PageRank algorithm” as the gold standard in filtering high-quality source material, calling it “certainly the most famous algorithm ever built around citations” and noting that at present, Meta AI’s SIDE “has nothing like” Google’s weighted modeling.
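
For context, PageRank’s idea of weighting sources by the structure of a citation (link) graph can be sketched with a few lines of power iteration. The toy graph, damping factor, and iteration count below are illustrative defaults, not Google’s production algorithm.

```python
# Toy power-iteration PageRank over a tiny citation graph (illustrative only).
def pagerank(links: dict[str, list[str]], damping: float = 0.85, iters: int = 50) -> dict[str, float]:
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iters):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            if not outgoing:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / len(pages)
            else:
                for target in outgoing:
                    new_rank[target] += damping * rank[page] / len(outgoing)
        rank = new_rank
    return rank


# Pages cited by well-ranked pages end up ranked higher themselves.
print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]}))
```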

Petroni told Digital Trends the group is very interested in “trying to model explicitly the trustworthiness of a source, the trustworthiness of a domain,” which would certainly have broad applications. Beyond working in multiple languages, the goal is also for the models to handle several types of media, including images and video as well as text.

The system is still a work in progress, but “Wikipedia editors can now test SIDE and assess its usefulness,” writes Review Geek, noting that “the project is also available on GitHub.”
