The existing airborne geophysical data represent 50 years of accumulated exploration and scientific research results. They faithfully record the development of the Center's airborne geophysical technology and reflect the pioneering role and outstanding contribution of airborne geophysics to geological prospecting. To enable information-based management and permanent use of these data, and to broaden the scope of airborne geophysical services, the digitization and informatization work follows these principles: "respect history, stay faithful to the original, preserve the original appearance, unify requirements, strengthen monitoring, and ensure quality".
Second, the text data digitization process
Digitizing airborne geophysical text data means entering the written materials of airborne geophysical exploration and scientific research projects as electronic documents, scanning handwritten and mimeographed materials and saving them as images, and writing a brief introduction to each project's results (Figure 8-5).
Figure 8-5 Flowchart of Text Data Digitization
Third, methods for digitizing written materials
In essence, digitizing the written materials means entering the Center's existing airborne geophysical exploration and scientific research reports into the computer, by scan recognition or by manual input, and re-editing them in the format and layout specified by the airborne geophysical information system. The results are Word documents (DOC format) and Adobe Acrobat documents (PDF format) that meet the requirements for loading into the database.
(1) Quality classification of written materials and media
To choose a suitable digitization method for each document, the 639 written materials (reports) of exploration and scientific research results were divided into four categories according to the quality of the paper and the clarity of the handwriting and illustrations (Table 8-4). All materials from before 1973 are of poor quality, and all poor and medium quality materials date from before the mid-1980s. Project materials produced after the mid-1980s are of good quality.
Table 8-4 Statistical Table of Text Data Media Quality Classification
(2) Digitization methods for written materials
Based on this classification and on the digitization principles above, a specific digitization method was determined for each category of material.
1. Manual input method
Most of the poor and medium quality materials are handwritten, copied, or mimeographed, and a few are letterpress printed. After long storage and several relocations they are badly damaged and the writing is blurred, so they cannot be entered by scan recognition and must be entered manually. Specific requirements were set for the input work to ensure the quality of manual input.
2. Scan recognition input method
Materials in the two better quality categories were produced by standard letterpress printing or as Word documents (DOC format). Their text is clear, so they can be entered by scan recognition, which is more efficient than manual input.
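The rule described in the two methods above can be sketched in a few lines of Python. This is only an illustration: the category names in `Quality` are assumptions (the actual labels are those of Table 8-4), and `digitization_method` is a hypothetical helper, not part of the information system.

```python
from enum import Enum


class Quality(Enum):
    """Four media quality categories; the names are assumed for illustration."""
    POOR = 1
    MEDIUM = 2
    FAIRLY_GOOD = 3
    GOOD = 4


def digitization_method(quality: Quality) -> str:
    """Return the input method suggested for a report of the given quality."""
    # Poor and medium quality materials are too damaged or blurred to scan.
    if quality in (Quality.POOR, Quality.MEDIUM):
        return "manual input"
    # The two better categories are clear enough for scan recognition.
    return "scan recognition"


if __name__ == "__main__":
    for q in Quality:
        print(q.name, "->", digitization_method(q))
```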
In total, 608 written reports of airborne geophysical exploration projects, about 2.998+0 million words, and 306 texts of airborne geophysical research projects, about 8.398+0.00000 words, were entered and edited.
To ensure long-term preservation of the Center's data, all 617 manually entered written materials, about 9,719 pages (115 exploration reports, about 3,240 pages; 502 scientific research reports, about 6,479 pages), were also scanned, saved in PDF format, and written directly to CD.
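The counts in the preceding paragraph can be cross-checked with simple arithmetic. The sketch below uses only the figures stated there (115 and 502 copies; 3,240 and 6,479 pages); the dictionary layout and the name `totals` are assumptions made for illustration.

```python
# Scanned archive figures as stated in the text (copies and approximate pages).
ARCHIVE = {
    "exploration": {"copies": 115, "pages": 3240},
    "research": {"copies": 502, "pages": 6479},
}


def totals(archive: dict) -> tuple[int, int]:
    """Sum copies and pages over all report types."""
    copies = sum(part["copies"] for part in archive.values())
    pages = sum(part["pages"] for part in archive.values())
    return copies, pages


if __name__ == "__main__":
    copies, pages = totals(ARCHIVE)
    # The parts add up to the stated totals of 617 copies and 9,719 pages.
    print(copies, pages)
```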
Fourth, proofreading and checking
After the text has been entered manually or by scan recognition, the operator first self-checks the resulting Word document. Every proof must then pass a second proofreading, and some also receive a third proofreading or a sampling inspection, before it goes on to editing and typesetting, which produces the final Word document in a unified DOC format.
For historical reasons, the early manuscripts contain mistakes and irregularities in both text and illustrations. Provided the authenticity of the manuscript is preserved, problems found in the proofs were corrected as far as possible. Where the original itself is incomplete or ambiguous, the passage was left blank. Where the original carries an errata sheet, the text was corrected item by item, so the original errata are essentially no longer needed.
Fifth, scanning and vectorization of illustrations
Using MapGIS software, 1,260 illustrations from the original reports, such as survey sketch maps, section maps, and geological interpretation maps, were vectorized from the originals. Numerical scales on the original illustrations were converted into linear (bar) scales, a section scale was added to single section maps, and the geological symbols used in geological maps were unified. These changes improved the quality of the illustrations in the text reports (Figure 8-6).
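The conversion from a numerical scale to a linear scale is plain arithmetic, as the sketch below shows. It does not use MapGIS: the function names and the 1:200,000 example scale are assumptions chosen for illustration.

```python
def ground_metres_per_map_cm(scale_denominator: int) -> float:
    """Ground distance in metres represented by 1 cm on a 1:N map."""
    # 1 cm on the map is N cm on the ground, and 100 cm make one metre.
    return scale_denominator / 100.0


def scale_bar_length_cm(scale_denominator: int, ground_km: float) -> float:
    """Length in cm of a scale bar that represents ground_km kilometres."""
    ground_cm = ground_km * 1000.0 * 100.0  # kilometres -> centimetres
    return ground_cm / scale_denominator


if __name__ == "__main__":
    # Hypothetical example: a 1:200,000 sketch map with a 10 km scale bar.
    print(ground_metres_per_map_cm(200_000))
    print(scale_bar_length_cm(200_000, 10.0))
```

A bar scale stays correct when a vectorized illustration is enlarged or reduced for typesetting, whereas a printed numerical scale does not, which is one reason to prefer it in re-edited reports.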
Sixth, editing and typesetting
Written reports of exploration projects and of scientific research projects differ in content by nature, but even reports of the same kind differ greatly in content, typesetting, and editing. This reflects the progress of airborne geophysical technology over time, and also the past lack of unified standards and requirements for written materials. During digitization, the reports were brought into line with the unified requirements of the project so that they are convenient for computer management and service. With the content of each report kept unchanged, all entered reports were typeset in Word according to the prescribed editing template, with automatically generated tables of contents and a unified report cover and other forms. The final proof therefore differs from the original in cover, heading levels, illustrations, page positions, and similar details, but not in content. Under this requirement, 799 written materials, about 47,645 pages, were typeset and converted from DOC format to PDF format.
Figure 8-6a Illustration before vectorization
Figure 8-6b Illustration after vectorization
Seventh, compilation of project introductions
So that users can quickly grasp the general situation of a project and the main content of its results report without reading the full text, 455 project introductions, about 280,000 words in total, were compiled to meet the database requirements of the airborne geophysical information system. Of these, 423 cover airborne geophysical exploration results reports (about 260,000 words) and 32 cover scientific research results reports (about 20,000 words). Each introduction summarizes the main content of the project results in condensed form, including the working or research methods, the quality evaluation, the main achievements, and the final conclusions.
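One possible way to hold such an introduction as a database record is sketched below. The class `ProjectProfile`, its field names, and the helper `count_by_kind` are hypothetical; they mirror the four content items listed above and are not the schema of the actual information system.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ProjectProfile:
    """A condensed project introduction (field names are assumed)."""
    title: str
    kind: str  # "exploration" or "research"
    methods: str
    quality_evaluation: str
    main_achievements: str
    conclusions: str

    def is_complete(self) -> bool:
        """True when all four summary sections contain some text."""
        sections = (self.methods, self.quality_evaluation,
                    self.main_achievements, self.conclusions)
        return all(section.strip() for section in sections)


def count_by_kind(profiles: list[ProjectProfile]) -> Counter:
    """Count introductions per report kind, e.g. to compare with 423 and 32."""
    return Counter(profile.kind for profile in profiles)


if __name__ == "__main__":
    demo = ProjectProfile("Example survey", "exploration",
                          "methods text", "quality text",
                          "achievements text", "conclusions text")
    print(demo.is_complete(), count_by_kind([demo]))
```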