Large-Scale Text Mining on GPU
Xiaohui Cui (2009)
Paper’s reference in the IEEE style?
X. Cui, “Large-Scale Text Mining on GPU,” Computational Sciences and Engineering Division, Oak Ridge National Laboratory, 2009. [1]
How did you find the paper?
If applicable, write a list of the search terms you used.
- "GPU text analysis"
Was the paper peer reviewed? Explain how you found out.
Does the author(s) work in a university or a government-funded research institute? If so, which university or research institute? If not, where do they work?
The author works at Oak Ridge National Laboratory in the USA and is a research scientist in the field of Artificial Intelligence.
What does this tell you about their expertise? Are they an expert in the topic area?
As a research scientist in Artificial Intelligence at a national laboratory, the author can be regarded as an expert in the field.
What was the paper about?
This is a presentation on large-scale text mining using GPU-enhanced computing. The analysis includes text encoding, dimension reduction and document clustering.
The research supports the analysis of very large volumes of streaming documents, such as those monitored by the US Department of Homeland Security.
GPUs provide a new computing approach in which hundreds of streaming processors (SPs) on a GPU chip simultaneously communicate and cooperate to solve complex computing problems.
High performance and massively parallel:
- CPU: ~4 cores (quad core), 30~40 GFLOPS
- GPU: ~240 cores (Nvidia GTX 280), 933 GFLOPS
- GPU price: $400~$700
- Quad-core CPU price: $1000
High memory bandwidth:
- CPU: 21 GB/s
- GPU: 142 GB/s
Energy efficient: a simple architecture optimized for compute-intensive tasks and energy efficiency.
- GPU: 0.21 W/GFLOPS
- CPU: 1.43 W/GFLOPS
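To make the comparison easier to read, the figures quoted above can be turned into GPU-to-CPU ratios. The short Python sketch below only does that arithmetic; the variable names are my own, and the ratios are theoretical peak figures from the presentation, not measured speedups.

```python
# Figures as quoted in the presentation (2009 hardware).
CPU_GFLOPS_RANGE = (30.0, 40.0)   # quad-core CPU, quoted as 30~40 GFLOPS
GPU_GFLOPS = 933.0                # Nvidia GTX 280
CPU_BANDWIDTH_GBS = 21.0
GPU_BANDWIDTH_GBS = 142.0
CPU_WATTS_PER_GFLOPS = 1.43
GPU_WATTS_PER_GFLOPS = 0.21

# Peak compute ratio, using both ends of the quoted CPU range.
compute_ratio_low = GPU_GFLOPS / CPU_GFLOPS_RANGE[1]
compute_ratio_high = GPU_GFLOPS / CPU_GFLOPS_RANGE[0]

# Memory bandwidth ratio (GPU over CPU).
bandwidth_ratio = GPU_BANDWIDTH_GBS / CPU_BANDWIDTH_GBS

# Energy ratio: how many times more watts per GFLOPS the CPU needs.
energy_ratio = CPU_WATTS_PER_GFLOPS / GPU_WATTS_PER_GFLOPS

print(f"Peak compute: {compute_ratio_low:.1f}x to {compute_ratio_high:.1f}x")
print(f"Memory bandwidth: {bandwidth_ratio:.1f}x")
print(f"Energy per GFLOPS: {energy_ratio:.1f}x in favour of the GPU")
```

On these numbers the GPU has roughly 23 to 31 times the peak compute and about 7 times the memory bandwidth and energy efficiency of the CPU. Peak ratios like these are an upper bound, so measured speedups for a real workload can be noticeably lower.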
Using GPUs significantly improved processing: overall, a single GPU was six times faster than a single-threaded CPU for the analysis.
A cluster of 10 GPUs converted documents to TF-IDF vectors 30 times faster than 10 CPUs.
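For readers unfamiliar with TF-IDF, the following is a minimal CPU-only Python sketch of converting documents to TF-IDF vectors. The presentation does not say which TF-IDF variant or tokenizer was used, so the whitespace tokenizer, the weighting formula (term frequency times log(N / document frequency)) and the function name `tf_idf` are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return (vocabulary, vectors) for a list of document strings.

    Assumed weighting: (term count / document length) * log(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vocabulary = sorted(doc_freq)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append([
            (counts[term] / total) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


vocabulary, vectors = tf_idf(["gpu text mining", "gpu clustering"])
```

Once the document frequencies are known, each document's vector can be computed independently of the others, which is the kind of work that spreads well across many processors.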
GPUs also gave a significant improvement for SVD-based document dimension reduction.
When agent-based flocking was used to cluster the documents, GPUs clustered them nearly 40 times faster.
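The idea behind flocking-based clustering can be illustrated with a heavily simplified Python sketch. This is not the author's algorithm: it assumes each document is an agent on a 2D plane that is pulled toward agents with similar vectors and pushed away from dissimilar ones, and it leaves out the velocity, alignment and neighbourhood rules of a full flocking model. The function names, the similarity threshold and the step size are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def distance(p, q):
    return math.hypot(q[0] - p[0], q[1] - p[1])


def flock_step(positions, vectors, threshold=0.5, step=0.1):
    """Move every agent once, using only the previous positions."""
    new_positions = []
    for i, (x, y) in enumerate(positions):
        dx = dy = 0.0
        for j, (ox, oy) in enumerate(positions):
            if i == j:
                continue
            dist = distance((x, y), (ox, oy)) or 1e-9
            ux, uy = (ox - x) / dist, (oy - y) / dist
            if cosine(vectors[i], vectors[j]) >= threshold:
                # Similar document: attraction grows with distance.
                dx += ux * dist
                dy += uy * dist
            else:
                # Dissimilar document: repulsion fades with distance.
                dx -= ux / dist
                dy -= uy / dist
        new_positions.append((x + step * dx, y + step * dy))
    return new_positions


# Two pairs of similar documents, placed on the corners of a unit square.
vectors = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
positions = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
for _ in range(30):
    positions = flock_step(positions, vectors)
```

After the loop, the two similar pairs sit close together while the pairs themselves have drifted apart. Each agent's update reads only the previous positions, so all agents can be updated at the same time, which suggests why this style of clustering suits a GPU.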
If applicable, is this paper similar to other papers you have read for this assignment? If so, which papers and why?
This presentation covers text mining and document clustering using GPUs and is related to:
If applicable, is this paper different to other papers you have read for this assignment? If so, which papers and why?
What do these similarities and differences suggest? What are your observations? Do you have any new ideas? Do you have any conclusions?
In order to process and obtain insights from large volumes of unstructured data, different techniques and equipment are necessary.
With the ongoing growth in social media, email and other unstructured data, the ability to increase the speed and reduce the cost of processing will become increasingly important. This presentation indicates that GPUs can be used to achieve these goals.
This question is to be answered after your critical analysis is completed. Which sections (if any) of your critical analysis was this paper cited in?
References
[1] X. Cui, “Large-Scale Text Mining on GPU,” Computational Sciences and Engineering Division, Oak Ridge National Laboratory, 2009.