Project Naptha is a browser extension for Google Chrome that allows users to highlight, copy, edit and translate text within images. It was created by developer Kevin Kwok and released in April 2014 as a Chrome add-on. The software was first made available only on Google Chrome, downloadable from the Chrome Web Store. It was later made available for Mozilla Firefox through the Mozilla Firefox add-ons repository, but was soon removed. The reason for the removal remains unknown.

The browser extension uses advanced imaging technology. Similar technologies have also been employed to produce hardcopy art and to identify such works.

By adopting several Optical Character Recognition (OCR) algorithms, including libraries developed by Microsoft Research and Google, the extension automatically identifies text in images. The OCR enables the build-up of a model of text regions, words and letters from all images.

The OCR technology that Project Naptha adopts differs slightly from the technology used by software such as Google Drive and Microsoft OneNote to analyse text within images. Project Naptha also makes use of a method called Stroke Width Transform (SWT), developed by Microsoft Research in 2008 as a form of text detection.

The name Naptha is derived from naphtha, a general term that originated a few thousand years ago and refers to flammable liquid hydrocarbon. The process of highlighting text also inspired the naming of the project.

Difficulty in translation of words from images

Editing, copying or quoting text inside images was difficult before software such as Project Naptha arrived. Previously, the only way to search for or copy a sentence from an image was to transcribe the text manually.

In May 2012, Kevin Kwok was reading about seam carving, an algorithm that rescales images without distorting or damaging the quality of the image. Kwok noticed that the seams tend to converge and arrange themselves in a way that cuts through the spaces between letters. A particularly verbose comic inspired him to develop software that can read images (with canvas), figure out the positions of the lines and letters, and draw selection overlays to assuage a pervasive text-selection habit.

He projected the image onto its side, forming a vertical histogram of pixels. The significant valleys of the resulting histogram served as a signature for the ends of text lines. Once the horizontal lines were detected, each line was automatically cropped, and the histogram process was repeated until all horizontal lines in the image had been identified. To determine the letter positions, a similar process was carried out, this time vertically. However, the vertical pass was unsuccessful because the projections it created were not readable. It was less effective, which showed that the process was strictly applicable only to horizontal machine-printed text.

Faced with considerable technical difficulties, Kwok abandoned the project in 2012. It was not until he went on to study at the Massachusetts Institute of Technology (MIT) and entered a hackathon that he picked the project up again. The project eventually won him second place. To him, selecting text in pictures was something that was manageable on a technical level. The relevant technology had existed and been readily available for quite some time, yet for some inexplicable reason it had not been extended to translating text from images. Once Kwok decided to start on his project again, the technology for transcription, translation, text erasure and modification flowed naturally afterwards.

Before Optical Character Recognition (OCR) can be applied, the software first has to identify whether blocks of text exist in an image. Once the blocks of text are identified, the OCR enables the build-up of a model of text regions, words and letters from any image.