Difference between revisions of "Semantic tagging"
Edit summary: Making this clear in layman's terms
Revision as of 03:26, 18 December 2006
Semantic tagging is a way to classify certain entities. The tagging process consists of two primary steps:
- Identify the instances that shall be classified (tagged); e.g., cities.
- Classify (tag) these instances by assigning them to certain categories, if applicable; e.g., national capitals.
Almost all of the semantic tagging that takes place in Centiare falls into one of two types: Relations and Attributes.
Suppose you were writing an article about the city of Berlin, Germany. You could easily type out in the article that "Berlin is the capital of the unified country of Germany, and there are nearly 3.4 million people living in its metropolitan area." That's really good encyclopedia information.
However, if someone searches for the exact phrase "capital of Germany" or "population of Berlin", your sentence would not be returned by either of those text searches, even though it answers both questions. What we hope to see in Centiare is active use of the semantic tagging process, so that such information is more likely to be found, whether by humans typing in queries or by machines programmed to find information.
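The search problem described above can be illustrated with a minimal Python sketch. It assumes a plain case-insensitive exact-phrase match, which is a simplification and not a description of how Centiare's text search is implemented; the function name `exact_phrase_match` is hypothetical.

```python
# The example sentence from the Berlin article.
sentence = (
    "Berlin is the capital of the unified country of Germany, "
    "and there are nearly 3.4 million people living in its "
    "metropolitan area."
)


def exact_phrase_match(text: str, phrase: str) -> bool:
    """Case-insensitive exact-phrase search (a simplifying assumption)."""
    return phrase.lower() in text.lower()


# Neither phrase appears verbatim, although the sentence answers both questions.
for phrase in ("capital of Germany", "population of Berlin"):
    print(phrase, "->", exact_phrase_match(sentence, phrase))
```

Both lookups come back negative because the words of each phrase are separated or reworded in the sentence, which is the gap that semantic tags are meant to close.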
So, the essence of semantic tagging is this: somewhere in the Berlin article text, in an infobox, or even in an addendum at the bottom of the article, you add a semantic link. To create one that describes a "capital relationship", write:
[[capital of::Germany]]
Note the use of two (2) colons in succession. You've just created a semantic tag Relation.
Similarly, to create a semantic link that describes a "population attribute", write:
[[population:=3,396,990]]
Note the use of the colon and equal sign in succession. You've just created a semantic tag Attribute.
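To make the two notations concrete, here is a rough Python sketch of how a program might pull Relation and Attribute tags out of article text. The regular expression, the function name `extract_tags`, and the tuple layout are assumptions made for illustration; they are not Centiare's actual parser.

```python
import re
from typing import List, Tuple

# Matches [[property::value]] (Relation) and [[property:=value]] (Attribute).
# Ordinary links such as [[Page|label]] do not match.
TAG_PATTERN = re.compile(r"\[\[([^\[\]:|]+?)(::|:=)([^\[\]|]+?)\]\]")


def extract_tags(page: str, text: str) -> List[Tuple[str, str, str, str]]:
    """Return (kind, subject, property, value) for each semantic tag in text."""
    tags = []
    for prop, separator, value in TAG_PATTERN.findall(text):
        kind = "relation" if separator == "::" else "attribute"
        tags.append((kind, page, prop.strip(), value.strip()))
    return tags


article = (
    "Berlin is the [[capital of::Germany]]. "
    "Population: [[population:=3,396,990]]."
)
for tag in extract_tags("Berlin", article):
    print(tag)
```

Each tag yields a subject, a property, and a value: the page itself (Berlin), the name before the separator, and the text after it.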
People who use the "Search Triple" feature in Centiare can reliably find the articles they are looking for, provided that they use the search forms correctly and that you have tagged your articles correctly for semantic searching.
The possibilities are vast, for both businesses and individuals. Imagine conducting a search for a male, born in Michigan between 1965 and 1968, who has interests in both skiing and poker. Do you think that would be easy with MySpace, Wikipedia, or Google? Fat chance. But on Centiare, it will be a piece of cake.
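The kind of search described above can be sketched in Python over a small in-memory set of tags. Everything here is hypothetical: the pages ("Person A" and so on), the property names ("gender", "born in", "birth year", "interest"), and the function names are invented for illustration, and the code does not represent Centiare's Search Triple or ASK features.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical tags as (subject, property, value); all data is invented.
triples: List[Tuple[str, str, object]] = [
    ("Person A", "gender", "male"),
    ("Person A", "born in", "Michigan"),
    ("Person A", "birth year", 1966),
    ("Person A", "interest", "skiing"),
    ("Person A", "interest", "poker"),
    ("Person B", "gender", "male"),
    ("Person B", "born in", "Michigan"),
    ("Person B", "birth year", 1972),
    ("Person B", "interest", "skiing"),
    ("Person B", "interest", "poker"),
    ("Person C", "gender", "male"),
    ("Person C", "born in", "Michigan"),
    ("Person C", "birth year", 1967),
    ("Person C", "interest", "skiing"),
]


def build_index(rows) -> Dict[str, Dict[str, Set[object]]]:
    """Group tag values by subject and then by property."""
    index: Dict[str, Dict[str, Set[object]]] = {}
    for subject, prop, value in rows:
        index.setdefault(subject, {}).setdefault(prop, set()).add(value)
    return index


def find_matches(index) -> List[str]:
    """Males born in Michigan in 1965-1968 who like both skiing and poker."""
    matches = []
    for subject, props in index.items():
        if (
            "male" in props.get("gender", set())
            and "Michigan" in props.get("born in", set())
            and any(1965 <= year <= 1968 for year in props.get("birth year", set()))
            and {"skiing", "poker"} <= props.get("interest", set())
        ):
            matches.append(subject)
    return sorted(matches)


print(find_matches(build_index(triples)))
```

Only "Person A" meets every condition: "Person B" falls outside the birth-year range and "Person C" lacks the poker interest. A full-text search could not express these combined conditions directly, whereas tagged properties can be tested one by one.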
The ASK function
Centiare uses a parser function called ASK to perform free-form queries. ASK provides a ready complement to the Search Triple query-by-form facility.
