Paper by Manuel Sánchez-Gestido, presented at the workshop-seminar Interactivos?'08: Juegos de la visión, held at Medialab-Prado from May 30 through June 14, 2008.
Abstract:
As human beings, we receive most of our sensory information about the outside world through our eyes. Much of it, however, is processed subconsciously by the brain, and because that activity is very difficult to make explicit, it is also extremely hard to convert it into something machine-processable. Only in recent years has the field of Computer Vision evolved far enough to develop a number of robust and efficient techniques that, after a learning/training process, can extract information from a scene in a way similar to how human beings do. Once this is achieved, all the visual information available over the Internet (video, photography, etc.), in particular the huge amount of user-generated content, will become automatically interconnected by hypervisual links in addition to the existing hypertextual links. This will drastically extend the current Web, suddenly creating new networks of visual information uncovered by the automatic processing of that content.
This paper presents the Computer Vision and Semantic technologies that are becoming available to meet this challenge. It also describes a number of potential applications that would greatly improve the way visual information is incorporated into the digital world and thus into our lives.