Optical Character Recognition
Optical character recognition (OCR) uses hardware (computers and sometimes peripherals such as scanners) together with software to recognize printed text characters. OCR enables users to digitally “read” and store printed text, thereby reducing the amount of typing required to input text. Scanning and interpreting text may seem simple, but the technologies required are complex, and the potential uses are broad and important and can be scaled toward increasingly general and difficult recognition problems.
The first general-use OCR system was created by futurist Ray Kurzweil in 1976. His company, Kurzweil Computer Products, Inc., released a commercial version two years later.
OCR is an analog-to-digital process: it begins with analog materials and converts them to digital data. To scan printed words (typically black text printed on a white page), a computer relies on a scanner, which is built around a charge-coupled device (CCD). The CCD is charged by light, and successively codes and records the reflections at each point on the image. Scanning a text document turns it into a bitmap. The OCR software then analyzes the light and dark areas of the bitmap and translates the results into character data that the computer can store and output. The success of the process depends in large part on the quality of the source material (the clarity of the image being scanned), the effectiveness of the hardware (for example, the resolution of the scanner being used), and the quality of the software (its level of sophistication and accuracy).
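As a rough illustration of the step in which light and dark areas of the bitmap are separated, the following minimal Python sketch thresholds a small grayscale grid into "ink" and "paper" cells. The function name `binarize`, the threshold of 128, and the sample brightness values are assumptions made for the example; real OCR software typically chooses thresholds adaptively.

```python
def binarize(gray, threshold=128):
    """Turn rows of 0-255 brightness values into 1 (ink) / 0 (paper) cells.

    The fixed threshold is an assumption for this sketch.
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# Hypothetical 3x5 scan of a short dark stroke on a light page.
scan = [
    [250, 240, 30, 245, 251],
    [248, 60, 25, 243, 250],
    [249, 241, 35, 244, 252],
]

bitmap = binarize(scan)
for row in bitmap:
    # Show ink as '#' and paper as '.'
    print("".join("#" if cell else "." for cell in row))
```

Dark pixels (low brightness) become ink cells, which later stages can group into candidate character shapes.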
The most difficult part of the process is the translation from light/dark images to words, a procedure that requires sophisticated pattern recognition. The OCR software must match shapes to character definitions. Since there is an abundance of fonts (letter and number character-shape sets) and languages (many using a wide variety of letter and number character-shape sets), there are many complexities and ambiguities that must be recognized, interpreted, and translated. Introducing non-regularized characters, like those produced by handwriting, further complicates the process.
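One simple way to picture matching shapes to character definitions is template matching: compare a scanned glyph cell by cell against stored reference shapes and pick the closest. The sketch below is only an illustration under that assumption; the 3x3 templates, the `score` and `classify` helpers, and the sample glyph are all hypothetical, and production OCR systems use far richer features.

```python
# Hypothetical 3x3 character definitions ('1' = ink, '0' = paper).
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def score(glyph, template):
    """Count the cells where the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def classify(glyph):
    """Return the name of the template that agrees with the glyph most."""
    return max(TEMPLATES, key=lambda name: score(glyph, TEMPLATES[name]))


# A "T" with one stray ink cell in the bottom-right corner.
noisy_t = ["111", "010", "011"]
print(classify(noisy_t))
```

Each additional font or script adds more templates to compare against, which hints at why the abundance of character-shape sets makes recognition harder.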
OCR software must first find and note the boundary and region of a pattern; it must then match the object to items with which the object is conceptually related. The context in which the item is used informs the machine's ability to make choices about its identity. For example, one does not often find two instances of the letter “y” together in English words, so if there are two adjacent shapes that both look like “y,” at least one of them is probably something else. The software must then select the proper representation pattern. Since pattern recognition is a sort of under-constrained problem—one in which there isn't enough information to know that the arrived-at solution is uniquely correct—the system must also include other means of evaluating the accuracy of the solution.
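The role of context can be sketched with the "yy" example: if two adjacent shapes each look slightly more like "y" than "v", shape alone would yield "yy", but weighting each letter pair by how plausible it is in English can change the choice. The scores and the `BIGRAM_WEIGHT` table below are invented for illustration, not measured frequencies, and `best_pair` is a hypothetical helper.

```python
from itertools import product

# Hypothetical plausibility weights for adjacent letter pairs in English.
BIGRAM_WEIGHT = {"yy": 0.01, "yv": 0.10, "vy": 0.30, "vv": 0.05}


def best_pair(first, second):
    """Pick the letter pair with the best shape score times context weight.

    `first` and `second` map candidate letters to shape-match scores.
    """
    def combined(pair):
        a, b = pair
        return first[a] * second[b] * BIGRAM_WEIGHT.get(a + b, 0.0)

    return "".join(max(product(first, second), key=combined))


# Shape matching alone slightly prefers "y" in both positions.
first = {"y": 0.55, "v": 0.45}
second = {"y": 0.60, "v": 0.40}

shape_only = max(first, key=first.get) + max(second, key=second.get)
print(shape_only)                 # choice without context
print(best_pair(first, second))   # choice with context
```

With these made-up numbers the context-aware choice is "vy" rather than "yy", mirroring the reasoning that at least one of two adjacent "y"-like shapes is probably something else.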
The algorithms in OCR and other character-recognition software rely on complex trial-and-error functions that model and simplify human cognitive systems. Using fuzzy logic and concepts from neural networks, programs can be “taught” through repeated trial and error to recognize ambiguous or previously unknown characters, thereby enabling scanners and OCR software to recognize handwriting or exotic languages and to respond to idiosyncratic written input.
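To give a flavor of "teaching" by trial and error, the sketch below trains a perceptron, one of the simplest neural-network models, to tell a 3x3 "I" from a 3x3 "L". The perceptron is chosen here purely for illustration and is not claimed to be what any particular OCR product uses; the glyphs and the helper names `to_vector`, `predict`, and `train` are assumptions.

```python
def to_vector(rows):
    """Flatten rows of '0'/'1' characters into a list of ints."""
    return [int(ch) for row in rows for ch in row]


def predict(weights, bias, x):
    """Return +1 ("L") if the weighted sum is positive, otherwise -1 ("I")."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


def train(samples, epochs=20):
    """Adjust the weights after every wrong guess until a pass has no mistakes."""
    weights, bias = [0] * len(samples[0][0]), 0
    for _ in range(epochs):
        mistakes = 0
        for x, label in samples:
            if predict(weights, bias, x) != label:
                weights = [w + label * xi for w, xi in zip(weights, x)]
                bias += label
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


# Hypothetical training glyphs: label -1 means "I", label +1 means "L".
samples = [
    (to_vector(["010", "010", "010"]), -1),
    (to_vector(["100", "100", "111"]), 1),
]
weights, bias = train(samples)

# An "L" with one missing ink cell, never seen during training.
smudged_l = to_vector(["100", "100", "110"])
print("L" if predict(weights, bias, smudged_l) == 1 else "I")
```

The trained weights still classify the smudged glyph as "L", a small-scale version of recognizing a previously unseen variant of a character.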
...