Item – Theses Canada
OCLC number
1224180695
Link(s) to full text
LAC copy
Author
Tong, Zhou.
Title
A document exploring system on LDA topic model for Wikipedia articles.
Degree
Master of Science -- Acadia University, 2016
Publisher
[Wolfville, Nova Scotia] : Acadia University 2016
Description
1 online resource
Abstract
Organizing and exploring millions of documents, papers and other text information has become a challenge for researchers and publishers. As machine learning techniques have developed rapidly and come into wide use, a new text mining method called the topic model was proposed in 2003. The topic model is based on Latent Dirichlet allocation (LDA) and has drawn much attention since it was introduced. The LDA topic model is a probabilistic model that processes text documents and reveals their hidden topics. Unlike other document processing methods, which work on content directly, the LDA topic model converts documents into topic distributions. The results are easier to understand, categorize and compare. Most importantly, topics make more sense to humans than structured machine formats. In this thesis, we briefly introduce the background of the LDA topic model and its working principles. We then explain in detail how to apply the LDA topic model to a text corpus through experiments on Simple Wikipedia documents. The experiments include all necessary steps: data retrieval, pre-processing, model fitting and evaluation. The results of the experiments show that the LDA topic model works effectively for document clustering and for finding similar documents. Based on the LDA topic model, we also propose a document exploring system that allows users to organize and explore documents by topic, making related documents easier to find and access.
Other link(s)
scholar.acadiau.ca
Subject
LE3 .A278 2016
Date modified:
2022-09-01