Friday, January 18, 2008

Web Mining based Intelligent Search Engines

Introduction


• The information explosion has made it difficult to find required information
• Search engines are an important tool for finding information on the internet
• Current search engines lack accuracy and personalization
• With the rapid growth of the internet, building an effective and accurate intelligent search engine based on web mining technology has become a major research issue


Evolution of Search Engines


• One of the first web search engines: World Wide Web Worm (1994)
• As the number of websites grew, new techniques were needed to return accurate search results
• Most available search engines return several thousand pages, most of which are irrelevant
• The quality and relevance of search results can be improved using the following three technologies:
1. Clustering the web documents
2. Analyzing the web hyperlink structure
3. Analyzing the web usage logs
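
To make the second technology in the list above more concrete, here is a minimal sketch of hyperlink structure analysis. It assumes a PageRank-style power iteration, which is one well-known way to score pages by their links; the presentation does not say which algorithm it uses, and the function name `pagerank`, the damping value, and the toy link graph are illustrative assumptions only.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by link structure using a basic power iteration.

    links maps each page to the list of pages it links to.
    The damping factor and iteration count are assumed defaults.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform distribution over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page web used only for illustration.
toy_web = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html"],
    "d.html": ["c.html"],
}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
```

In this toy graph the page with the most incoming links, `c.html`, ends up with the highest score, which is the kind of signal a search engine could combine with clustering and usage-log analysis when ordering results.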

More in the presentation. Download the Presentation

Monday, January 14, 2008

Screening PPT - Sanjay Sugumar

Download this Presentation

Presentation prepared for PhD Screening

Screening PPT - Saravanan

Presentation prepared for PhD Screening.

Download this Presentation

Progress Review Meeting - Jan 2008, Rajesh Kumar

Download this Report

The proposed research work aims at exploiting the web usage regularities and information structures in web pages to build intelligent information systems.

Problems Identified:

Need to improve precision and speed in structure mining
Query-based classification using links
Statistical approaches to link mining with metadata discovery and mapping
Applying link mining in the Semantic Web
Combining information extraction with link mining techniques to construct the Semantic Web
Making use of semantic and ontological information in link mining
Filtering sequential patterns and clusters
Need to develop tools that incorporate statistical methods, visualization, and human factors to help better understand the mined knowledge

Course work completed:

Advanced Internet Technologies.

Paper Published:

Saravanan. P, Dr. D. Sridharan, Rajesh Kumar. C, “The Current Approaches in Automatic Identification of Fragments and Informative Sections”, International Conference on Trendz in Information Sciences and Computing, Sathyabama University, Chennai, 2007.

Screening PPT - Rajesh Kumar

Presentation prepared for PhD Screening

Download Presentation

Progress Review Meeting - Jan 2008, Sanjay Sugumar

The proposed research work aims at developing algorithms for building an efficient system for information extraction from web, using suitable soft computing approaches.

Problems Identified:

Search engines use simple, blind keyword-based query processing
No deductive capability in information extraction
No soft decisions when classifying web content
Lack of personalization based on the user’s history and nature
Web services are ignored when extracting information from the web
Only simple host-based clustering algorithms are used
Present techniques don’t mine linguistic association rules
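
As a rough illustration of why host-based clustering is listed as a limitation above, the sketch below groups result URLs purely by hostname. This is an assumption about what "host based clustering" refers to; the report does not define it, and the function name `cluster_by_host` and the example URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def cluster_by_host(urls):
    """Group URLs by hostname only, ignoring page content entirely."""
    clusters = defaultdict(list)
    for url in urls:
        # netloc is the host part of the URL; lowercase it so that
        # "Example.org" and "example.org" fall into the same cluster.
        host = urlparse(url).netloc.lower()
        clusters[host].append(url)
    return dict(clusters)


# Hypothetical search results for an ambiguous query.
results = [
    "http://example.org/jaguar/cars",
    "http://example.org/jaguar/animals",
    "http://example.com/jaguar",
]
clusters = cluster_by_host(results)
```

Here the pages about cars and about animals land in the same cluster simply because they share a host, while a related page on another host is separated. A content-aware or soft clustering method would be needed to group results by topic instead.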

Course work completed:

Advanced Databases (Instead of Database Technology)

Progress Review Meeting - July 2007, Sanjay Sugumar

Download this Report

The proposed research work aims at developing algorithms for building an efficient system for information extraction from web, using suitable soft computing approaches.

Problems Identified:

Search engines use simple, blind keyword-based query processing
No deductive capability in information extraction
No soft decisions when classifying web content
Lack of personalization based on the user’s history and nature
Web services are ignored when extracting information from the web
Only simple host-based clustering algorithms are used
Present techniques don’t mine linguistic association rules

Course work completed:

Data Warehousing and Data Mining
Advanced Internet Technologies