Introduction
• The information explosion has made it difficult to find the information one needs
• Search engines are an important tool for finding information on the internet
• Current search engines lack accuracy and personalization
• With the rapid growth of the internet, building an effective and accurate intelligent search engine based on web mining technology has become a key research issue
Evolution of Search Engines
• One of the first web search engines – 1994 → World Wide Web Worm
• As the number of websites increased, new techniques were required to return accurate search results
• Most available search engines return several thousand pages, most of which are irrelevant
• The quality and relevance of search results can be improved using the following three technologies:
1. Clustering the web documents
2. Analyzing the web hyperlink structure
3. Analyzing the web usage logs
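As an illustration of the second technology, hyperlink structure analysis, here is a minimal sketch of a PageRank-style ranking computed by power iteration. PageRank is used only as a well-known example of link analysis; the presentation does not say which algorithm it has in mind. The function name `pagerank`, the damping value of 0.85, and the four-page `toy_web` graph are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by link structure using simple power iteration.

    links maps each page to the list of pages it links to.
    All names and values here are hypothetical, chosen for illustration.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Toy web graph (assumed data): page "c" is linked to by "a", "b" and "d".
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)

for page, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {score:.4f}")
```

In this toy graph, page "c" comes out on top because three pages link to it, while "d", which nothing links to, ranks last. A search engine could combine such a link-based score with keyword matching to push irrelevant pages down the result list.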
Friday, January 18, 2008
Monday, January 14, 2008
Progress Review Meeting - Jan 2008, Rajesh Kumar
The proposed research work aims at exploiting the web usage regularities and information structures in web pages to build intelligent information systems.
Problems Identified:
Need to improve precision and speed in structure mining
Query-based classification using links
Statistical approaches to link mining with metadata discovery and mapping
Applying link mining in the Semantic Web
Combining information extraction with link mining techniques to construct the Semantic Web
Making use of semantic and ontological information in link mining
Filtering sequential patterns and clusters
Need to develop tools that incorporate statistical methods, visualization, and human factors to help better understand the mined knowledge
Course work completed:
Advanced Internet Technologies.
Paper Published:
Saravanan P., Dr. D. Sridharan, Rajesh Kumar C., “The Current Approaches in Automatic Identification of Fragments and Informative Sections”, International Conference on Trendz in Information Sciences and Computing, Sathyabama University, Chennai, 2007
Progress Review Meeting - Jan 2008, Sanjay Sugumar
The proposed research work aims at developing algorithms for building an efficient system for information extraction from the web, using suitable soft computing approaches.
Problems Identified:
Search engines use simple, blind keyword-based query processing
No deductive capability in information extraction
No soft decisions in classifying web content
Lack of personalization based on the user’s history and nature
Web services are ignored when extracting information from the web
Only simple host-based clustering algorithms are used
Present techniques do not mine linguistic association rules
Course work completed:
Advanced Databases (instead of Database Technology)
Progress Review Meeting - July 2007, Sanjay Sugumar
The proposed research work aims at developing algorithms for building an efficient system for information extraction from the web, using suitable soft computing approaches.
Problems Identified:
Search engines use simple, blind keyword-based query processing
No deductive capability in information extraction
No soft decisions in classifying web content
Lack of personalization based on the user’s history and nature
Web services are ignored when extracting information from the web
Only simple host-based clustering algorithms are used
Present techniques do not mine linguistic association rules
Course work completed:
Data Warehousing and Data Mining
Advanced Internet Technologies