Patent Information Retrieval and Mining, Infringement and Invalidation

Last Updated October 10, 2011

Welcome to the home of the Multimedia Information System Technology Group (MIST)! MIST is an interdisciplinary research group based in the Computer Science Department at the University of California, Los Angeles (UCLA), in collaboration with the UCLA School of Medicine and the Department of Radiological Sciences. The group is immersed in competitive and exciting R&D in multimedia technology and its applications, multimedia database management, searching the Internet for interesting information rather than merely relevant content, and patent information retrieval and mining, and it constantly works at the cutting edge of these fields. On this site, you will find information about the people who are part of MIST as well as about the group's past and present work.

Several new R&D thrusts are being undertaken, including mobile medical patient data management and access, image stream data management and visualization, intelligent searching and retrieval of interesting information on the Internet, patent information retrieval and mining, and social networks. Our advances and innovations are described in the publications listed on this website.

iScore
The definition of what makes an article interesting - or its “interestingness” - varies from user to user and is continually evolving, calling for adaptable user personalization. Furthermore, due to the nature of news articles, most are uninteresting since many are similar or report events outside the scope of an individual’s concerns. There has been much work in news recommendation systems, but none have yet addressed the question of what makes an article interesting. In our system, iScore, we make the following contributions to news filtering in a limited user environment:

We show that filtering based on only topic relevancy is insufficient for identifying interesting articles.

We extract a variety of features, ranging from topic relevancy to source reputation. No single feature can characterize the interestingness of an article for a user. It is the combination of multiple features that yields higher quality results. For each user, these features have different degrees of usefulness for predicting interestingness.

We evaluate several classifiers for combining these features into an overall interestingness score. Through user feedback, the classifiers find the features that are useful for predicting interestingness for that user.

We observe that current evaluation corpora, such as TREC, do not capture all aspects of personalized news filtering systems that are necessary for system evaluation.
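
The idea of combining several per-article features into one score that adapts to user feedback can be sketched as follows. This is only an illustrative sketch: the online logistic-regression combiner, the class name InterestingnessScorer, and the feature names topic_relevancy and source_reputation are assumptions made for the example, not the classifiers or feature set actually used in iScore.

```python
import math


class InterestingnessScorer:
    """Hypothetical combiner: online logistic regression over feature scores."""

    def __init__(self, feature_names, learning_rate=0.5):
        self.feature_names = list(feature_names)
        self.learning_rate = learning_rate
        # All weights start at zero, so every article initially scores 0.5.
        self.weights = {name: 0.0 for name in self.feature_names}
        self.bias = 0.0

    def score(self, features):
        """Return an interestingness score in (0, 1) for one article."""
        z = self.bias + sum(
            self.weights[name] * features.get(name, 0.0)
            for name in self.feature_names
        )
        return 1.0 / (1.0 + math.exp(-z))

    def feedback(self, features, interesting):
        """Update the per-user weights from one piece of user feedback."""
        error = (1.0 if interesting else 0.0) - self.score(features)
        self.bias += self.learning_rate * error
        for name in self.feature_names:
            self.weights[name] += self.learning_rate * error * features.get(name, 0.0)


# Two made-up articles that are equally on-topic but differ in source reputation.
liked = {"topic_relevancy": 0.9, "source_reputation": 0.9}
disliked = {"topic_relevancy": 0.9, "source_reputation": 0.1}

scorer = InterestingnessScorer(["topic_relevancy", "source_reputation"])
initial_score = scorer.score(liked)
for _ in range(200):
    scorer.feedback(liked, interesting=True)
    scorer.feedback(disliked, interesting=False)

print(scorer.score(liked) > scorer.score(disliked))
```

In this toy setting, topic relevancy alone cannot separate the two articles, so the learned weights end up relying on the second feature, which mirrors the point that topic relevancy by itself is insufficient.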

MiScore (Multimedia Interestingness Score)
With the explosion of the availability of multimedia information (text, pictures/photos, video, audio, and various combinations) on the Internet, we aim to enhance the above iScore effort to include multimedia “interestingness”, in addition to finding textual information of interest to a particular user. We focus on the innovations needed to identify documents (from news feeds, document databases, etc.) of interest based on both textual and pictorial/photographic/visual information, while filtering out the much larger volume of relevant but uninteresting documents.
An example query: find medical documents that deal with prostate cancer and its successful treatments AND that include images/photos of a cancerous prostate and charts showing PSA progression over time. Today this is not possible without a massive manual search and review of the many relevant documents retrieved by current browsers to check whether such images are present in each document. Our focus is not image processing itself; rather, we use existing image-processing technologies in our innovations to determine multimedia interestingness.
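
The example query above can be pictured as a filter with a textual part and a visual part. The sketch below is a minimal illustration under stated assumptions: the dictionary layout, the function name matches_query, and the image labels (which are assumed to come from some existing image-analysis tool) are all hypothetical and are not part of MiScore.

```python
def matches_query(doc, required_terms, required_images):
    """Return True if the document satisfies both the textual and the visual part.

    doc is a dict with "text" (str) and "image_labels" (labels assumed to be
    produced by an existing image-analysis tool; hypothetical here).
    """
    text = doc["text"].lower()
    has_terms = all(term in text for term in required_terms)
    has_images = set(required_images) <= set(doc["image_labels"])
    return has_terms and has_images


# Made-up documents for illustration only.
documents = [
    {
        "title": "Case study with imaging",
        "text": "Prostate cancer treatment outcomes with follow-up.",
        "image_labels": {"cancerous_prostate_photo", "psa_progression_chart"},
    },
    {
        "title": "Text-only review",
        "text": "A review of prostate cancer treatment options.",
        "image_labels": set(),
    },
]

hits = [
    doc["title"]
    for doc in documents
    if matches_query(
        doc,
        required_terms=["prostate cancer", "treatment"],
        required_images=["cancerous_prostate_photo", "psa_progression_chart"],
    )
]
print(hits)
```

Both documents are textually relevant, but only the first also contains the requested kinds of images, which is the distinction a text-only search cannot make.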

Hot News and Articles in the Internet
Our aim is to develop know-how, algorithms, and a system for near-real-time news extraction capable of identifying up-to-date “hot news” or “hot topics” from large amounts of news reports or other article corpora on the Internet. Given all news articles on the current day p and the archives of previous news, we wish to identify the hot news articles on day p. We have defined hotness and its important characteristics, and have developed models and algorithms that identify the hot news articles in experiments with Yahoo RSS news feeds, with good results. In our ongoing research and experiments, we aim to obtain hot articles under various user interests from other types of corpora in several domains.
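
The problem setup (all articles on day p plus an archive of earlier days) can be illustrated with a small sketch. The hotness measure used here, a term-burst ratio averaged over an article's terms, is an assumption chosen for illustration; it is not the definition of hotness or the models developed by the group. The function names and the smoothing parameter are likewise hypothetical.

```python
from collections import Counter


def term_doc_freq(articles):
    """Count, for each term, how many of the given articles contain it."""
    counts = Counter()
    for text in articles:
        counts.update(set(text.lower().split()))
    return counts


def rank_hot_articles(today, archive_days, smoothing=1.0):
    """Rank today's articles by an assumed burst-based hotness score.

    today: list of article texts for day p.
    archive_days: list of earlier days, each a list of article texts.
    """
    today_df = term_doc_freq(today)
    archive_df = Counter()
    for day in archive_days:
        archive_df.update(term_doc_freq(day))
    n_days = max(len(archive_days), 1)

    def burst(term):
        # How much more often the term appears today than on a typical past day.
        expected = archive_df[term] / n_days
        return today_df[term] / (expected + smoothing)

    scored = []
    for text in today:
        terms = set(text.lower().split())
        hotness = sum(burst(t) for t in terms) / len(terms) if terms else 0.0
        scored.append((hotness, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Made-up articles for illustration only.
today = [
    "earthquake hits coastal city",
    "earthquake rescue teams deployed",
    "local team wins match",
]
archive = [
    ["local team wins match", "city council meets"],
    ["local team loses match"],
]
ranked = rank_hot_articles(today, archive)
for hotness, text in ranked:
    print(round(hotness, 3), text)
```

Articles about a topic that is frequent today but rare in the archive rise to the top, while the routine sports story, whose terms appear on most archived days, ranks last.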

Patent Information Retrieval and Mining, Infringement and Invalidation
We consider various practical concerns in patent legal practice, such as patent information retrieval and mining, infringement, prior art search, and the validation and invalidation of patents. We aim to advance intelligent retrieval and mining of patents by content, taking into account more of the semantics (shallow semantics) of the content and the intent of the user who is searching. We treat not only isolated words but also specific relationships between words as important semantic characteristics in patent searching. These advances matter most in patent infringement, validation, and invalidation matters.
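
The difference between matching isolated words and matching a relationship between words can be shown with a small sketch. The notion of relationship used here, an ordered pair of words within a few tokens of each other in the same sentence, is an assumption chosen for simplicity; the group's own notion of shallow semantics may differ. The function names and the window parameter are hypothetical.

```python
import re


def word_pairs(text, window=4):
    """Return ordered (earlier, later) word pairs within `window` tokens in one sentence."""
    pairs = set()
    for sentence in re.split(r"[.;]", text.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        for i, first in enumerate(tokens):
            for second in tokens[i + 1 : i + 1 + window]:
                pairs.add((first, second))
    return pairs


def matches_words(text, words):
    """Isolated-word matching: every query word occurs somewhere in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return all(word in tokens for word in words)


def matches_relation(text, first, second, window=4):
    """Relationship matching: `first` is followed closely by `second` in a sentence."""
    return (first, second) in word_pairs(text, window)


# Made-up claim fragments for illustration only.
claim_a = "A valve controls the pump in response to a pressure signal."
claim_b = "A pump is mounted on the housing. A separate controller controls the valve."

print(matches_words(claim_a, ["valve", "pump"]), matches_words(claim_b, ["valve", "pump"]))
print(matches_relation(claim_a, "valve", "pump"), matches_relation(claim_b, "valve", "pump"))
```

Both fragments contain the words "valve" and "pump", so isolated-word matching retrieves both, while the relationship query keeps only the fragment in which the valve acts on the pump.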

The objectives, highlights, accomplishments, and publications of the previously completed research and development projects led or co-led by Professor Cardenas appear on the extensive websites of the Knowledge-Based Multimedia Medical Distributed Database System (KMED) and Multimedia Stream System (MMSS) projects.

Alfonso F. Cardenas, Ph.D.
cardenas@cs.ucla.edu
Computer Science Department
University of California, Los Angeles

The Multimedia Information System Technology Group Web site is maintained by Jose Rodriguez-Salinas