Tailored News Powered by Semantic Technologies
ctrl, the first research and development (R&D) product of PRAGMATECH (www.pragma-tech.com), is a text-processing library that performs semantic analysis on news-like textual documents. The ctrl API can be used for document summarization and extraction of key topics and, most importantly, to index and retrieve documents by concepts and topics rather than by keywords and key phrases.
With the advent of the Semantic Web, the word ‘semantics’ has been used very loosely. Some have even claimed to have systems that ‘understand’ natural language. At PRAGMATECH, on the other hand, we are well aware of the many technical and theoretical challenges that hinder real progress in natural language processing (NLP). These stem from difficulties in the semantic treatment of a number of phenomena in natural language, such as scope and reference resolution, textual entailment, metonymy, nominal compounds, intentionality, and metaphor, to name just a few.
Having said that, it is important to point out that challenges in NLP research do not necessarily mean that advanced information retrieval systems that go beyond the capabilities of statistical and keyword indexing cannot be developed. This is simply because retrieving relevant documents is not equivalent to language understanding, although the latter subsumes the former. Understanding a piece of text is a much broader capability and is a must when the goal is natural language question answering, but an ‘expert’ need not fully understand a piece of text to figure out its ‘aboutness’, i.e., to determine what a particular document is about.
Therefore, and notwithstanding our conviction that as of yet there are no systems that truly ‘understand’ natural language text, we do believe that semantic (or topic-based) retrieval can be achieved. It is in this restricted context of document retrieval that we use the term ‘semantics’ here: to mean retrieval of documents that are semantically (or topically) related, as opposed to keyword-based retrieval models. To achieve this, the analysis of document content in ctrl is based on the analysis of concepts and topics (compound concepts), as opposed to words and phrases.
In ctrl we process and analyze concepts, not words, and this includes names of things (people, countries, organizations, products, etc.). Thus, from words we try to infer the most likely meaning in context by a process known as Word Sense Disambiguation (WSD), which is an essential part of the semantic analysis process.
The accuracy of our WSD algorithm in inferring the most likely meaning of a word in some context is above 80%, and, as far as we know, this accuracy rate is much higher than any WSD results that have been reported in the computational linguistics literature.
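To make the idea of word sense disambiguation concrete, here is a minimal sketch of a simplified Lesk-style approach: pick the sense whose dictionary gloss shares the most words with the surrounding context. This is a textbook baseline, not ctrl's algorithm, and the sense inventory, glosses, and function names are all hypothetical.

```python
# Hypothetical toy sense inventory: word -> {sense label: gloss}.
SENSES = {
    "bank": {
        "financial_institution": "an institution that accepts deposits and lends money",
        "river_side": "sloping land beside a river or lake",
    },
}

STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "by", "at"}


def tokenize(text):
    """Lowercase, strip surrounding punctuation, and drop stopwords."""
    words = {w.strip(".,;:!?\"'").lower() for w in text.split()}
    return words - STOPWORDS - {""}


def disambiguate(word, context):
    """Return the sense label whose gloss overlaps most with the context."""
    context_words = tokenize(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context_words & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


if __name__ == "__main__":
    print(disambiguate("bank", "She sat on the bank of the river and watched the water"))
    print(disambiguate("bank", "He deposits money at the bank every week"))
```

A real system would need a full lexical resource and a much richer notion of context; the sketch only shows the shape of the problem, mapping a word plus its context to one of several candidate meanings.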
Beyond going from words to concepts (meanings), in ctrl we have moved from concepts to topics (which can be thought of as compound concepts composed from smaller, more primitive concepts). Thus, while (a specific meaning of) ‘system’ refers to a simple topic, ‘information systems’ is a more complex topic, and so is ‘information management systems’, and so on.
Consequently, although it is an essential part of the process, going from words to concepts is not the end goal in and of itself, since topics of interest are seldom described by single words but are typically expressed by complex linguistic objects (mainly noun phrases) such as “drug trafficking organizations”, “new NBA TV analyst Eric Snow”, “the upcoming Garry Marshall flick Valentine’s Day”, and so on.
Going from words to concepts, and from simple concepts to topics (or more complex concepts), is still not enough to achieve high rates of precision and recall in retrieving relevant documents. Even if we are highly accurate in inferring the right meaning of words in context, and even if we subsequently combine concepts into topics, what we are ultimately interested in is the set of key topics in a document, or simply what the document is mainly “about”, regardless of what other words, concepts, or topics are also mentioned in it.
In ctrl we give words meanings, including names of things. To do so, basic reference resolution and entity identification has to be performed. Names of people, organizations, movies, etc. are therefore not just words and phrases, but full-fledged concepts that can be related to and matched with other concepts and topics. Thus, while the phrase “popularity of former NY Yankees slugger Babe Ruth” is not at all related to “Dr. Ruth popularity in NY”, there is clearly some semantic relation between “US President Barack Obama” and “British Prime Minister Tony Blair” due to, among other things, the semantic relationship between the concept ‘Barack Obama’ (who is a president) and the concept of a ‘prime minister’.
Combining Word Sense Disambiguation (WSD) with the treatment of topics as the basic building blocks (as opposed to single words or even single concepts) allows ctrl to conceptually (or semantically) relate topics that are expressed in different words, such as “drug trafficking organizations” and “organized crime syndicates”.
The flip side of this is that topics composed of similar concepts can refer to entirely different subjects. For example, even if we resolve the meanings of “company” and “insurance”, an “insurance company” is a completely different concept/topic from “company insurance”. Similarly, even though the words “technology”, “data” and “business” refer to the same concepts in “data about the use of technology in the mining business” and “the use of data mining technology in business”, the two phrases refer to completely different topics. In fact, the latter is more related to “the commercial applications of machine learning”, even though those two phrases are expressed using completely different sets of words.
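The point that identical concepts can form different topics can be illustrated with a small sketch. It assumes the common rule of thumb that the last noun of an English nominal compound is its head and the preceding words are modifiers; this is an illustrative heuristic with hypothetical function names, not a description of how ctrl composes topics.

```python
def parse_compound(phrase):
    """Split a nominal compound into a head (last word) and its modifiers."""
    words = phrase.lower().split()
    return {"head": words[-1], "modifiers": tuple(words[:-1])}


def same_bag_of_words(a, b):
    """True when both phrases contain exactly the same words, ignoring order."""
    return sorted(a.lower().split()) == sorted(b.lower().split())


def same_topic_structure(a, b):
    """True only when head and modifiers match, so word order matters."""
    return parse_compound(a) == parse_compound(b)


if __name__ == "__main__":
    a, b = "insurance company", "company insurance"
    print(same_bag_of_words(a, b))     # a keyword index cannot tell them apart
    print(same_topic_structure(a, b))  # a structured topic comparison can
    print(parse_compound(a), parse_compound(b))
```

A keyword index sees the two phrases as identical, while even this crude structural view distinguishes a kind of company from a kind of insurance.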
ctrl is a library with an API that can be used to extract metadata (key topics and entities) from any textual document, to generate a document summary, and to index and retrieve documents by topics as opposed to keywords.
ctrl can be used in a number of domains for various purposes (besides the obvious application in search). For instance, in the news industry ctrl provides a tool to automatically generate ‘story highlights’, categorize and index any article based on its topics, and suggest related stories. This in turn helps both internal and external users (consumers) retrieve related documents for any desired topic. It further allows automatic information push (possibly in the form of RSS feeds) for user-selected topics. In this context ctrl can be used for effective targeted advertising, since users are retrieving highly relevant information that specifically matches their topics of interest.
ctrl can also be the new standard in the business intelligence industry for intelligent topic-based enterprise search. Its ability to provide relevant documents based on topics is a daily need in large corporations. The various solutions currently used in the industry are quite expensive, time-consuming, and error-prone, since they depend on expert topic (or metadata) engineering for effective performance.
In the intelligence community, ctrl means cutting the time and cost spent on ‘manually’ processing large numbers of documents in search of relevant information about specific topics, since the identification of topics (and, more importantly, of the key topics) is the most important differentiator between ctrl and existing systems.
Other applications can be developed around ctrl’s API functionality to serve many other fields, especially when integrated with existing software (e.g., database systems, desktop and document management tools, etc.).
ctrl has gone through a series of tests for both technical and non-technical reasons. In addition to the usual software quality assurance tests, thorough testing has been conducted to validate our WSD algorithms (using SemCor and our own test collection), as well as the generation of metadata and document summaries. The precision and recall of document retrieval using ctrl’s topic-based indexing and retrieval was also extensively tested using our own collections (obtained from various media sources) as well as the widely used Reuters test collection.
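For readers unfamiliar with the two retrieval measures mentioned above, the following sketch shows how precision and recall are computed for a single query using their standard definitions. The document identifiers are made up for illustration and are not results from any ctrl evaluation.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical document ids, for illustration only.
    retrieved = ["d1", "d2", "d3", "d4"]
    relevant = ["d1", "d3", "d5"]
    p, r = precision_recall(retrieved, relevant)
    print(f"precision={p:.2f} recall={r:.2f}")
```

Precision penalizes irrelevant documents in the result list, while recall penalizes relevant documents that were missed, which is why both are reported together.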
Ctrl-News is a free online news service that lets users receive personalized news using the CTRL semantic engine. Users subscribe to the service by submitting a profile, which is one or more “topic(s) of interest”. Users can do this by tracking current stories, by pasting or typing a couple of paragraphs that describe the story they would like to track, or by entering some topics of interest from a certain category (e.g., Information Technology >> Cloud Computing, or Politics >> Terrorism, etc.).
Ctrl-News then fetches daily news stories that are semantically/topically related to a user’s topic(s) of interest. Along with each news story retrieved, users can see an automatically generated summary, a list of the key topics for the news article, the key entities found (people, places, products, organizations, etc.), as well as a set of topically related stories found on that day.
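The profile-to-story matching described above can be sketched very roughly as follows. The sketch assumes each story has already been reduced to a set of normalized key topics, and it scores stories against a user profile with Jaccard overlap. The scoring method, the threshold, the function names, and the sample data are all assumptions made for illustration; the source does not say how the service ranks stories.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of topic labels."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def stories_for_profile(profile_topics, stories, threshold=0.2):
    """Return story titles whose key topics overlap the profile, best first.

    `stories` maps a title to its set of key topics (hypothetical input).
    """
    scored = [(jaccard(profile_topics, topics), title)
              for title, topics in stories.items()]
    return [title for score, title in sorted(scored, reverse=True)
            if score >= threshold]


if __name__ == "__main__":
    profile = {"cloud computing", "data centers"}
    stories = {
        "A": {"cloud computing", "data centers", "energy"},
        "B": {"football", "transfers"},
        "C": {"cloud computing", "startups", "funding"},
    }
    print(stories_for_profile(profile, stories))
```

Exact label overlap is far weaker than the semantic matching the article describes, since it cannot relate topics expressed in different words; it only shows how a subscription profile can act as a standing query over each day's stories.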
Sign up for this service for free (http://www.ctrl-information.com/), save time sifting through a multitude of irrelevant stories, and start getting tailored and intelligently filtered news stories. If you like the service, we would like to hear from you. You are also welcome to invite friends who might be interested in using or testing this service.
Check out CTRL-News at: http://www.ctrl-news..com/?artwork=9