Changeset 12867
- Timestamp: 20 Apr 2015, 17:32:55 (10 years ago)
- Location: main/waeup.kofa/trunk
- Files: 4 edited
main/waeup.kofa/trunk/docs/source/userdocs/datacenter/import.rst
r12863 r12867
 ***********
 
-.. contents:: Table of Contents
-   :local:
+The term 'Data Import' understates what importers actually do. As already stated, many importers do more than restore data previously backed up by exporters, that is, take values from CSV files and write them one-to-one into the database. The data pass through a complex, staged processing algorithm. We therefore prefer to call them 'batch processors' rather than importers. The staged import process is described below.
+
+1. File Upload
+==============
+
+Users with permission
+:py:class:`waeup.manageDataCenter<waeup.kofa.permissions.ManageDataCenter>`
+are allowed to access the data center and also to use the upload page. On this page they can see a long table of available batch processors. The table lists required, optional and non-schema fields (see below) for each processor. It also provides a CSV file template which can be filled in and uploaded to avoid header errors.
+
+Data center managers can upload any kind of CSV file from their local computer. The uploader does not check the integrity of the content, only the validity of its CSV encoding (see :py:func:`check_csv_charset<waeup.kofa.utils.helpers.check_csv_charset>`). It also checks the filename extension and allows only a limited number of files in the data center.
+
+.. autoattribute:: waeup.kofa.browser.pages.DatacenterUploadPage.max_files
+
+When the upload has succeeded, the uploader sends an email to all import managers of the portal (users with role :py:class:`waeup.ImportManager<waeup.kofa.permissions.ImportManager>`), notifying them that a new file was uploaded.
+
+The uploader changes the filename. An uploaded file foo.csv is stored as foo_USERNAME.csv, where USERNAME is the user id of the currently logged-in user. Spaces in the filename are replaced by underscores. Pending data filenames remain unchanged (see below).
+
+After file upload the data center manager can click the 'Process data' button to open the page where files can be selected for import (import step 1). After selecting a file the data center manager can preview the header and the first three records of the uploaded file (import step 2). If the preview fails or the header contains duplicate column titles, an error message is raised. The user cannot proceed and is asked to replace the uploaded file. If the preview succeeds, the user can proceed to the next step (import step 3) by selecting the appropriate processor and an import mode.
+
+2. File Header Validation
+=========================
+
+Import step 3 is the stage where the file content is assessed for the first time: the column titles are checked against the fields of the chosen processor. The page shows the header and the first record of the uploaded file. It allows the user to change column fields or to ignore entire columns during import. One or more column titles may be misspelled, or the person who created the file may have ignored the case-sensitivity of field names. The data center manager can then easily fix this by selecting the correct title and clicking the 'Set headerfields' button. Setting the header fields is temporary; it does not change the file itself.
+
+The page also calls the `checkHeaders` method of the batch processor, which checks for required fields. If a required column title is missing, a warning message is raised and the user cannot proceed to the next step.
+
+3. Data Validation and Import
+=============================
+
+Kofa does not validate the data in advance. It tries to import the data row-by-row while reading the CSV file. The reason is that import files very often contain thousands or even tens of thousands of records. It is not feasible for data managers to edit import files until they are error-free. Very often such an error is not really a mistake made by the person who compiled the file. Example: The import file contains course results although the student has not yet registered the courses. Then the import of this single record has to wait, i.e. it has to be marked pending, until the student has added the course ticket. Only then can it be edited by the batch processor.
+
+The core import method is:
+
+.. automethod:: waeup.kofa.utils.batching.BatchProcessor.doImport()
+   :noindex:
+
+
+
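The filename rule and the header checks described in the import.rst text above can be illustrated with a minimal sketch. It is not Kofa's code: `stored_filename` and `check_header` are hypothetical helpers, and the order in which spaces are replaced and the user id is appended is an assumption, since the changeset only states the end result.

```python
import os


def stored_filename(original_name, user_id):
    """Return the name an uploaded file would be stored under.

    Hypothetical helper: spaces become underscores and the user id is
    appended before the extension (foo.csv -> foo_USERNAME.csv).
    """
    base, ext = os.path.splitext(original_name.replace(" ", "_"))
    return f"{base}_{user_id}{ext}"


def check_header(header, required):
    """Return a list of problems found in a CSV header.

    Hypothetical helper: reports duplicate column titles and missing
    required fields. The comparison is case-sensitive, like field names.
    """
    problems = []
    duplicates = sorted({title for title in header if header.count(title) > 1})
    if duplicates:
        problems.append("duplicate column titles: " + ", ".join(duplicates))
    missing = [field for field in required if field not in header]
    if missing:
        problems.append("missing required fields: " + ", ".join(missing))
    return problems
```

An empty list from `check_header` corresponds to the case where the user may proceed to the next import step; any entry corresponds to an error or warning message.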
main/waeup.kofa/trunk/docs/source/userdocs/datacenter/intro.rst
r12866 r12867
 Administrators of web portals, which store their data in relational databases, are used to getting direct access to the portal's database. There are even tools to handle the administration of these databases over the Internet, like phpMyAdmin or phpPgAdmin to handle MySQL or PostgreSQL databases respectively. These user interfaces bypass the portals' user interfaces and give direct access to the database. They make it easy to import or export (dump) data tables or the entire database structure into CSV or SQL files. What at first sight appears to be very helpful and administration-friendly proves to be very dangerous on closer inspection. Data structures can be easily damaged or destroyed, or data can be easily manipulated by circumventing the portal's security machinery or logging system. Kofa does not provide any external user interface to access the ZODB_ directly, neither for viewing nor for editing data. This also includes the export and import of sets of data. Exports and imports are handled via the Kofa user interface itself. This is called batch processing, which means either producing CSV files (comma-separated values) from portal data (export) or processing CSV files in order to add, update or remove portal data (import). The main premise of Kofa's batch processing technology is that the data stored in the ZODB_ can be specifically backed up and restored by exporting and importing data. But that's not all. Batch processors can do much more. They are an integral part of the student registration management.
 
+.. note::
+
+  Although exporters are part of Kofa's batch processing module, we will not call them batch processors. Only importers are called batch processors. Exporters produce CSV files, importers process them.
+
 
 .. _ZODB: http://www.zodb.org/
main/waeup.kofa/trunk/src/waeup/kofa/browser/templates/datacenteruploadpage.pt
r11558 r12867
       <a i18n:translate="" class="btn btn-primary btn-xs"
          tal:attributes="href python: 'skeleton?name=' + importer['name']">
-        Download CSV Skeleton File
+        Download CSV File Template
       </a>
     </td>
main/waeup.kofa/trunk/src/waeup/kofa/utils/batching.py
r12861 r12867
     def doImport(self, path, headerfields, mode='create', user='Unknown',
                  logger=None, ignore_empty=True):
-        """Perform actual import.
+        """In contrast to most other methods, ``doImport`` is not supposed to
+        be customized, neither in custom packages nor in derived batch
+        processor classes. Therefore, this is the only place where we
+        do import data.
+
+        Before this method starts creating or updating persistent data, it
+        prepares two more files in a temporary folder of the filesystem: (1)
+        a file for pending data with file extension ``.pending`` and (2)
+        a file for successfully processed data with file extension
+        ``.finished``. Then the method starts iterating over all rows of
+        the CSV file. Each row is treated as follows:
+
+        1. An empty row is skipped.
+
+        2. Empty strings are replaced by ignore-markers.
+
+        3. The `BatchProcessor.checkConversion` method validates all
+           values in the row. If the validation fails
+
+        4.
+
+        5.
+
+        6.
+
         """
         time_start = time.time()
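The first three row-handling steps listed in the new docstring can be sketched in a few lines. This is only an illustration under stated assumptions, not the `doImport` implementation: the `IGNORE_MARKER` value and the `validate` callable are hypothetical, rows are collected in memory instead of being written to ``.pending`` and ``.finished`` files, and treating a failed validation as "pending" is an assumption, because the docstring leaves that step unfinished.

```python
import csv
import io

# Assumption: the real ignore-marker value is not shown in the changeset.
IGNORE_MARKER = "<IGNORE>"


def process_rows(csv_text, validate):
    """Split CSV rows into finished and pending lists.

    `validate` is a hypothetical callable that returns a list of error
    messages for a row; an empty list means the row is valid.
    """
    finished, pending = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Step 1: skip rows in which every value is empty.
        if not any((value or "").strip() for value in row.values()):
            continue
        # Step 2: replace empty strings by the ignore-marker.
        row = {key: (value if value else IGNORE_MARKER) for key, value in row.items()}
        # Step 3: validate; failing rows are set aside (assumed behaviour).
        errors = validate(row)
        if errors:
            pending.append((row, errors))
        else:
            finished.append(row)
    return finished, pending
```

Keeping the pending rows together with their error messages mirrors the idea that such records can be corrected or retried later without touching the rows that were already processed.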