Semantic Web/Screen Scraping, and Forms

For the Semantic Web to reach its full potential, many people need to start publishing data as RDF. Where is this information going to come from? A lot of it can be derived from many data publications that exist today, using a process called "screen scraping". Screen scraping is the act of literally getting the data from a source into a more manageable form (i.e., RDF) using whatever means come to hand. Two useful tools for screen scraping are XSLT (an XML transformations language), and regular expression (in Perl, Python, and so on).

However, screen scraping is often a tedious solution, so another way to approach it is to build proper RDF systems that take input from the user and then store it straight away in RDF. Data such as you may enter when signing up for a new mail account, buying some CDs online, or searching for a used car can all be stored as RDF and then used on the Semantic Web.