Wednesday, January 22, 2014

Programs for reading DjVu

Many people use programs that read various document formats.
The .djvu format has become very popular.
It owes its popularity to the fact that it makes it easy to assemble books from pictures and photos. You can also convert scanned documents without text recognition. The DjVu Solo program is used to read documents in the .djvu format. There are certainly many other programs that read this format, but for some reason DjVu Solo has become a favorite of many PC users.


1. What kind of format is DjVu? What is it for? What does it look like?
DjVu is a graphics format, i.e. DjVu files are just "pictures". The main useful property of DjVu is that it compresses scanned images of paper book pages better than any other graphics format, which is why it is widely used for creating electronic versions of paper books. For more information on the DjVu format, see our homepage.

2. How exactly is the DjVu format used?
Typically it works like this: first you need to obtain ordinary graphics files ("pictures") containing images of the pages of a paper book, usually by scanning the book. Then these ordinary graphics files are encoded into the DjVu format, which greatly reduces their size without loss of text image quality. The resulting DjVu file is used as an e-book: it can be read on screen, such homemade files can be exchanged over the internet, and it can be printed on your printer.
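The scan -> encode -> single e-book workflow above can be sketched in a few lines of Python. This is only an illustration and not the toolchain recommended in this FAQ: it assumes the free DjVuLibre command-line tools (cjb2 for black-and-white pages, djvm for bundling) are installed, and the function names, file names and the 300 dpi default are hypothetical.

```python
from pathlib import Path


def encode_commands(scans, out_dir, dpi=300):
    """Build one cjb2 command per scanned page.

    Assumption: the scans are black-and-white PBM or TIFF files and
    DjVuLibre's cjb2 encoder is on the PATH.
    """
    commands = []
    for scan in scans:
        page = f"{out_dir}/{Path(scan).stem}.djvu"
        commands.append(["cjb2", "-dpi", str(dpi), str(scan), page])
    return commands


def bundle_command(pages, book):
    """Build the djvm command that bundles single-page files into one book."""
    return ["djvm", "-c", str(book), *[str(p) for p in pages]]


if __name__ == "__main__":
    scans = ["scan_001.pbm", "scan_002.pbm"]  # hypothetical file names
    for cmd in encode_commands(scans, "out"):
        print(" ".join(cmd))
    print(" ".join(bundle_command(["out/scan_001.djvu", "out/scan_002.djvu"], "book.djvu")))
```

The functions only build the command lines; to actually run them, pass each list to subprocess.run(cmd, check=True).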
In general, a text file can also be encoded into DjVu (via a virtual DjVu printer), but this rarely makes sense: a text file is always much smaller than any corresponding image file.

3. Which program should I use to encode files into the DjVu format (Windows OS)?
Use DjVu Small. It uses the "proprietary" command-line utilities from LizardTech, so it is no worse than the original LizardTech Document Express Enterprise with DjVu 5.1 package. But DjVu Small is more convenient to use and is small in size.
You can also use the following programs for this purpose: LizardTech Document Express Enterprise with DjVu 5.1 (DEE 5.1), LizardTech Document Express Enterprise with DjVu 5.1 Light Edition (DEE 5.1 LE), Document Express Editor (version 5 or 6), DjVu Solo 3.1. However, all of these are less preferable for various reasons (they are either huge, which makes them hard to download, or extremely inconvenient for batch encoding).

4. Which program should I use to decode DjVu files into regular image files (Windows OS)?
Use DjVu OCR 2 or higher. It is the only program that does not crash during decoding when some "pages" of a multipage DjVu file are damaged.
You can also use the following programs for this purpose: DjVu Small, IrfanView + IrfanView DjVu Plugin v3.95 or higher, or any ordinary virtual printer. However, all of these are less preferable for various reasons (either the risk of failure or the relative difficulty of configuration).

5. How do I decode one multipage DjVu file into a set of single-page (ordinary) image files?
Use DjVu OCR 2 or higher. It takes a single multipage file as input and decodes it into a set of simple single-page graphics files.
By the way, this is one of the general practices of working with DjVu: working preferentially with multipage files. With ordinary graphics files we try to deal mainly with single-page files (although multipage ordinary image files also exist); with DjVu it is the opposite: everyone tries to deal with one multipage DjVu file rather than with the set of single-page DjVu files that make it up. This is because ordinary image files can be edited, whereas DjVu files cannot: they can only be viewed and decoded.
Therefore you cannot even directly convert many single-page DjVu files into the corresponding set of simple single-page image files. You first need to collect the single-page DjVu files into one multipage DjVu file and then decode it into the desired set of single-page image files. This is LizardTech's policy (judging by DEE 5.1), so DjVu Small works the same way.
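For readers who prefer the command line, decoding a multipage file page by page can be sketched as follows. This is a sketch under assumptions, not the procedure of DjVu OCR or DjVu Small: it relies on the DjVuLibre tools ddjvu and djvused being installed, and the function names and the output file name pattern are made up for the example.

```python
import subprocess


def page_count(djvu_file):
    """Ask DjVuLibre's djvused for the number of pages (the 'n' command)."""
    result = subprocess.run(
        ["djvused", "-e", "n", djvu_file],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def decode_commands(djvu_file, pages, out_pattern="page_{:04d}.tif"):
    """Build one ddjvu command per page, each writing a single TIFF file."""
    return [
        ["ddjvu", "-format=tiff", f"-page={n}", djvu_file, out_pattern.format(n)]
        for n in range(1, pages + 1)
    ]


if __name__ == "__main__":
    book = "book.djvu"  # hypothetical input file
    for cmd in decode_commands(book, page_count(book)):
        subprocess.run(cmd, check=True)
```

Decoding one page per command keeps the output as a set of single-page image files, which is the form the answer above asks for.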
Added 28.02.2011: Today the most recommended program for decoding DjVu is MiniDjVu Plus v1.3 (893 KB). So far it is the only program that, when decoding a "broken" (or empty white) page of a multipage DjVu file, creates a clean white sheet (all other programs, at best, simply skip such a page when decoding). We plan to add this functionality to DjVu Small in the future.

6. How do I assemble a multipage DjVu file from a set of single-page DjVu files?
Use Document Express Editor v5.0.0 Build 16 plus (without OCR and Help). Open the first single-page DjVu file of the sequence, then choose Edit -> Insert Page(s) After..., select all the other single-page DjVu files of the sequence in the dialog box, and save the multipage file.

7. How do I add one single-page DjVu file to an existing multipage DjVu file?
See the previous question: you need to do exactly the same thing, except that you insert not a whole set but just this one single-page DjVu file.

8. What should I use to view DjVu files (Windows OS)?
Use WinDjView. It is the best choice: this program is superior to all other DjVu viewers, even the "official" DjVu browser plugin from LizardTech.

9. How do I convert a PDF file directly into a DjVu file (PDF -> DjVu)?
For direct conversion, use the Pdftodjvu LE package. The advantage of this conversion is that the OCR layer and hyperlinks are transferred from the PDF file into the DjVu file (although there are sometimes problems with Russian encodings).
However, a direct PDF -> DjVu conversion does not always make sense. Unlike DjVu, PDF is not a graphics format. PDF is a container format that can hold any type of data: text, images, tables, even audio and video. PDF files are usually divided into the following types:
- Vector: the PDF file contains only symbolic information: text, tables, charts, formulas, drawings. Such PDF files (they look like typeset text) are best converted to DjVu directly: you get excellent quality and the smallest DjVu file.
- Raster: the PDF file contains only bitmap images. Such PDF files (they look like scanned images of text) are best converted to DjVu not directly but via the scheme "PDF -> TIF -> DjVu". The reason is that the direct converter Pdftodjvu LE is "tuned" for vector PDF files and encodes bitmaps poorly, producing larger DjVu files.
- Raster-vector (mixed): such PDF files are usually better converted to DjVu NOT directly but via the scheme "PDF -> TIF -> DjVu". Only in rare cases can the direct conversion "PDF -> DjVu" be applied. It all depends on the particular PDF file and on the type and ratio of its raster and vector content. Try both options and choose the better result.

10. How do I convert a DjVu file directly into a PDF file?
This can be done with an ordinary virtual printer, for example Acrobat Distiller (installed automatically with the full Adobe Acrobat, not the Reader; it showed the best results), FinePrint v5.29, pdfFactory, or LEADTOOLS ePrint v4.0. However, not all of them work correctly with the Russian language.
With this kind of conversion, the OCR layer and hyperlinks are not carried over from the DjVu file into the PDF file.
There is one more catch: if you print from the DjVu browser plugin, you cannot send the file to the printer via "File" -> "Print" in the browser menu, because the DjVu file is then printed as blank sheets (apparently an implementation bug). Instead, right-click the file displayed in the browser and choose "Print..." from the context menu.

11. How do I convert a DjVu file directly into a Word DOC file?
It is impossible. DjVu is a graphics format, not a text format. DjVu files therefore contain no information about the structure of the document (at best, only the embedded OCR text layer), so there is simply nowhere in the DjVu file to get the information needed for such a direct conversion.
Therefore a Word DOC file is created from a DjVu file in the traditional way: the DjVu file is converted to any ordinary image format (see question 5) and recognized in ABBYY FineReader v7.0 (or v8.0). Recent releases of ABBYY FineReader (v9.0) can read the DjVu format directly (when building the batch for recognition).

12. What should I do if, when I open a DjVu file in DjVuSolo v3.1 NonComm, it displays the message "DjVu Decoder: Cannot decode DjVu files with version >= 25"?
There are two ways to solve this problem:
A. Open the file and work with it in Document Express Editor v5.0.0 Build 16 plus (without OCR and Help) (2.0 MB) instead of DjVuSolo v3.1 NonComm.
B. Downgrade the version of the DjVu file with the DjVuVersion utility; you will then be able to work with it in DjVuSolo v3.1 NonComm.

13. Do DjVu files have a built-in OCR text layer?
Yes and no. The ability to add an OCR text layer to a DjVu file exists (it is provided for by the DjVu specification). Usually the content of the DjVu file is recognized by ABBYY FineReader v7.0 (or v8.0) and the recognition result is then inserted into the DjVu file with the DjVuOCR 2.1 program.
At the same time, there are many DjVu files in which such a text layer is absent (i.e. nobody created and inserted it), and because of this there is still a misconception that DjVu files do not support a built-in OCR layer.

14. Why insert an OCR text layer into a DjVu file?
The question is not as simple as it may seem. Some people who make DjVu books (i.e. DjVu files) deliberately do not insert an OCR layer, because it increases the size of the e-book and therefore makes it more expensive to download from the internet.
Usually an OCR text layer is inserted into a DjVu book simply for convenience: while reading the book you can select a piece of text with the mouse, copy it to the clipboard, and paste it into a text file.
By the way, the built-in OCR text layer contains no information about the structure of the document: it is just plain continuous text without any formatting (although the DjVu file does also store service information about the position of the words, or optionally the characters, of the text).
But there is a far more important and less obvious reason to insert an OCR text layer into DjVu books: to enable keyword search across a collection of DjVu books (even if you do not need it personally, it is needed by those who will use your book alongside others).
Various software can be used for such a search, for example Google Desktop Search + DjVu Indexer Plugin or (best) dtSearch + LizardTech DjVu iFilter.
Within a single DjVu book you can search by keyword directly in a DjVu viewer, for example in WinDjView (the DjVu browser plugin can do this too).
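To make the idea of keyword search over the text layer concrete, here is a minimal Python sketch. It is not how WinDjView or the indexers above work internally. It assumes the DjVuLibre tool djvutxt is installed for extracting the text layer, and the function names are invented for the example. A page without an OCR layer simply yields empty text and is never matched, which is exactly why the layer matters for search.

```python
import subprocess


def page_text(djvu_file, page):
    """Extract the hidden text layer of one page with DjVuLibre's djvutxt."""
    result = subprocess.run(
        ["djvutxt", f"--page={page}", djvu_file],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def pages_with_keyword(page_texts, keyword):
    """Return the 1-based numbers of the pages whose text contains the keyword.

    The comparison is case-insensitive; pages with an empty text layer
    can never match.
    """
    needle = keyword.lower()
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in text.lower()
    ]


if __name__ == "__main__":
    # Hypothetical page texts standing in for the output of page_text().
    texts = ["Captain Cousteau sails", "", "The captain returns"]
    print(pages_with_keyword(texts, "captain"))
```

Running the same function over the page texts of several books gives a very simple search across a collection.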
It is possible that popular internet search engines will someday add support for this (although, unfortunately, the trend so far is just the opposite).
However, it does not make sense to insert an OCR text layer into every DjVu book: mainly only into educational and reference books. For example, in entertainment DjVu books (such as the voyages of Captain Cousteau), inserting an OCR text layer makes almost no sense.

15. How do I split a multipage DjVu file into single-page DjVu files?
And why do you need that? Such a need should almost never arise. Do you want to decode a multipage DjVu file into a set of simple single-page image files? Or convert it to PDF? Or perform some other transformation? None of this requires first splitting the multipage DjVu file into its "pages"; see the previous questions.
The general principle is:
It is most convenient to work with DjVu as a single multipage file rather than as the corresponding set of single-page DjVu files.
But what if you want to make a selection from an existing multipage DjVu file, that is, to save a subset of its "pages" as a new multipage DjVu file?
The most correct way is this: decode the desired range of "pages" of the multipage DjVu file into the corresponding set of simple single-page image files (using DjVu Small), and then encode the resulting image files into a new DjVu file (also with DjVu Small, doing OCR if necessary).
And why not do this by disassembling and reassembling the multipage DjVu file? Is such disassembly and reassembly even possible?
Yes, it can be done. But it is highly undesirable: the file size grows considerably. The reason is that when creating a multipage DjVu file, every LizardTech DjVu encoder uses so-called "dictionaries" (also called "djbz dictionaries"). A dictionary is a special trick for reducing the size of a multipage DjVu file. It contains a list of geometric shapes that are repeated on several DjVu pages and shared by them; as a rule, these are the letters of the text. The encoder usually creates a dictionary automatically for every 10-20 pages (a configurable parameter) of the multipage DjVu file and embeds it in the resulting file.
Thus, in a multipage DjVu file a new dictionary appears every 10-20 pages, and each dictionary stores the symbols common to its group of 10-20 pages, which ultimately gives a substantial reduction in the size of the multipage DjVu file.
Virtually all existing multipage DjVu files have built-in dictionaries.
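The grouping of pages around dictionaries can be illustrated with a small Python sketch. It only models the "one dictionary per N consecutive pages" idea described above; the real encoders may group pages differently, and the function name and the pages_per_dict parameter are assumptions made for this example.

```python
def dictionary_groups(page_count, pages_per_dict=10):
    """Split pages 1..page_count into consecutive groups sharing one dictionary.

    Returns a list of (first_page, last_page) tuples, one per dictionary.
    """
    if page_count < 0 or pages_per_dict < 1:
        raise ValueError("page_count must be >= 0 and pages_per_dict >= 1")
    return [
        (first, min(first + pages_per_dict - 1, page_count))
        for first in range(1, page_count + 1, pages_per_dict)
    ]


if __name__ == "__main__":
    # A hypothetical 45-page book with one dictionary per 20 pages.
    for first, last in dictionary_groups(45, pages_per_dict=20):
        print(f"dictionary shared by pages {first}-{last}")
```

With this model a 300-page book encoded with 10 pages per dictionary carries 30 dictionaries, each shared only by its own group of pages.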
When a multipage DjVu file is split into single pages, the splitting program has to add the dictionary symbols to each resulting single-page DjVu file, which increases their size. In my test case the size grew by about half. And when the single-page DjVu files are reassembled, the resulting multipage DjVu file keeps the inflated size, equal to the sum of the sizes of the single-page DjVu files: dictionaries are not restored during assembly.
If, despite all these explanations, you still want to do such a disassembly and reassembly, here is a description of the procedure.
Let's start with disassembly. You can use Document Express Editor v5.0.0 Build 16 plus (without OCR and Help): manually extract each "page" of the multipage DjVu file and save it separately via "Save Page As..." to get the corresponding set of single-page DjVu files.
However, this manual method is unacceptable for large DjVu files. To make the procedure easier, I wrote a simple console program, DjVu Unbundler (275 KB).
The resulting single-page DjVu files can be assembled as usual with Document Express Editor v5.0.0 Build 16 plus (without OCR and Help); see question 6.

16. What if none of my DjVu viewers work?
Disable the second core of your dual-core processor (for example, using Task Manager) and it will work.

17. Why shouldn't I use one big dictionary for the whole book when encoding to DjVu?
Cptn_Cook answers:
1) If the file is viewed over the network, nothing is displayed until the required dictionary has loaded completely. For a typical book the dictionary is 200-300 KB, which is significant on a modem.
2) If you edit a DjVu file with DjVuLibre, cutting and gluing very quickly leads to dictionaries being duplicated (it is not hard to see why). If a dictionary is about 10 KB this is not a problem, but at 200 KB it becomes unpleasant.
3) DjVu is quite resistant to errors, but if an error occurs in a dictionary, none of the pages that use it are displayed (with pages-per-dict = 10000, read: the whole book).
4) A book with a large dictionary opens slowly in a DjVu viewer; this is noticeable and irritates the user.
5) A large dictionary uses too much memory when working in WinDjView.
In fact, there is no need to set pages-per-dict = 10000. That value gives a 20% gain in file size, but pages-per-dict = 20 already gives a 10% gain (compared with pages-per-dict = 10). This seems a good compromise. Incidentally, any2djvu used to use pages-per-dict = 20 but then switched to pages-per-dict = 10, probably because processing is faster that way.

18. How do I print a DjVu file?
Download and install WinDjView. Open the DjVu file in it, choose "File" -> "Print" from the menu, and print it.

19. How do I convert a Word DOC file directly to DjVu?
There are two options:
1. Install a "virtual DjVu printer", open the DOC file in Word, and print it to this virtual printer.
2. If you have MS Word 2007, download Office 2007 with DjVu Exporter from http://tokage.celartem.com/beta, install it, open the DOC file in Word 2007, and save it in DjVu format.

20. How do I encode a set of single-page image files into the corresponding set of single-page DjVu files?
Better not to. The most correct approach is to encode the set of ordinary graphics files into one multipage DjVu file, because this creates dictionaries of shared characters, which significantly reduces the size of the resulting DjVu file.
If you really, really need it anyway, use Document Express Enterprise 5.1 (the Light Edition will do).

21. How do I encode color magazines to DjVu?
Encode the processed scans either entirely in DjVuPhoto (using DjVu Small or MiniDjVu Plus) or in PDF-JPEG2000 (Adobe Acrobat Pro).

22. How do I implement full-text search across the pages of a set of DjVu books?
