##
## imagestorage.py
## Login : <uli@pu.smp.net>
## Started on Mon Jul 4 16:02:14 2011 Uli Fouquet
## $Id$
##
## Copyright (C) 2011 Uli Fouquet
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 2 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##
"""A storage for image files.

A few words about storing files with ``waeup.sirp``. The need for this
feature arose initially from the need to store passport files for
applicants and students. These files are dynamic (can be changed
anytime), mean a lot of traffic and cost a lot of memory/disk space.

**Design Basics**

While one *can* store images and similar 'large binary objects' aka
blobs in the ZODB, this approach quickly becomes cumbersome and
difficult to understand. The worst approach here would be to store
images as regular byte-stream objects. ZODB supports this but
access is slow (data must be looked up in the one ``Data.fs`` file,
each file has to be sent to the ZEO server and back, etc.).

Somewhat better is the approach of storing images in the ZODB as
Blobs. ZODB supports storing blobs in separate files in order to
accelerate lookup/retrieval of these files. The files, however, have
to be sent to the ZEO server (and back on lookups) which means a
bottleneck and will easily result in an increased number of
``ConflictErrors`` even on simple reads.

The advantage of both ZODB-geared approaches is, of course, complete
database consistency. ZODB will guarantee that your files are
available under some object name and can be handled as any other
Python object.

Another approach is to leave the ZODB behind and to store images and
other files directly in the filesystem. This is faster (no ZEO
contacts, etc.), reduces the probability of ``ConflictErrors``, keeps
the ZODB smaller, and enables direct access (over the filesystem) to
the files. Furthermore, the steps involved might be easier for
third-party developers to understand. We opted for this last option.

**External File Store**

Our file storage API is defined in :class:`ExtFileStore`. An instance
of this file storage (which is also able to store non-image files) is
available at runtime as a global utility implementing
:class:`waeup.sirp.interfaces.IExtFileStore`.

The main task of this central component is to maintain a filesystem
root path for all files to be stored. It also provides methods to
store/get files under certain file ids which identify certain files
locally.

So, to store a file away, you can do something like this:

  >>> from StringIO import StringIO
  >>> from zope.component import getUtility
  >>> from waeup.sirp.interfaces import IExtFileStore
  >>> store = getUtility(IExtFileStore)
  >>> store.createFile('myfile.txt', StringIO('some file content'))

All you need is a filename and the file-like object containing the
real file data.

This will store the file somewhere (you shouldn't make too many
assumptions about the real filesystem path here).

Later, we can get the file back like this:

  >>> store.getFile('myfile.txt')
  <open file ...>

What we get back is a file or file-like object already opened for
reading:

  >>> store.getFile('myfile.txt').read()
  'some file content'

**Handlers: Special Places for Special Files**

The file store supports special handling for certain files. For
example we want applicant images to be stored in a different directory
than student images, etc. Because the file store cannot know all
details about this special treatment of certain files, it looks up
helpers (handlers) to provide the information it needs for really
storing the files at the correct location.

Special filenames indicate that a file stored in the file store needs
special handling. These filenames start with a marker like this::

  __<MARKER-STRING>__real-filename.jpg

Please note the double underscores before and after the marker
string. They indicate that everything in between is a marker.

If you store a file in the file store with such a filename (we call
this a `file_id` to distinguish it from real world filenames), the
file store will look up a handler for ``<MARKER-STRING>`` and pass it
the file to store. The handler then will return the internal path to
store the file and possibly do additional things as well, like
validating the file.

Examples for such a file store handler can be found in the
:mod:`waeup.sirp.applicants.applicant` module. Please see also the
:class:`DefaultFileStoreHandler` class below for more details.

The file store looks up handlers by utility lookups: it looks for a
named utility providing
:class:`waeup.sirp.interfaces.IFileStoreHandler` and named like the
marker string (without leading/trailing underscores) in lower
case. For example if the file id is

  ``__IMG_USER__manfred.jpg``

then the looked up utility should be registered under name

  ``img_user``

and provide :class:`waeup.sirp.interfaces.IFileStoreHandler`. If no
such utility can be found, a default handler is used instead
(see :class:`DefaultFileStoreHandler`).

**Context Adapters: Knowing Your Family**

Often the internal filename or file id of a file depends on a
context. For example when we store passport photographs of applicants,
then each image belongs to a certain applicant instance. It is not
difficult to maintain such a connection manually: say every applicant
had an id, then we could put this id into the filename as well and
would build the filename to store/get the connected file by using that
id. You then would create filenames of a format like this::

  __<MARKER-STRING>__applicant0001.jpg

where ``applicant0001`` would tell exactly which applicant you can see
on the photograph. You notice that the internal file id might have
nothing to do with the originally uploaded filename. The file with the
id above could have been uploaded as ``manfred.jpg``, but with the new
file id we are able to find the file again later.

Unfortunately it might soon get boring or cumbersome to retype this
building of filenames for a certain type of context, especially if
your filenames take more of the context into account than only a
simple id.

Therefore you can define filename building for a context as an adapter
that then could be looked up by other components simply by doing
something like:

  >>> from waeup.sirp.interfaces import IFileStoreNameChooser
  >>> file_id = IFileStoreNameChooser(my_context_obj).chooseName()

If you later want to change the way file ids are created from a
certain context, you only have to change the adapter implementation
accordingly.

Note that this is only a convenience component. You don't have to
define context adapters but it makes things easier for others if you
do, as you don't have to remember the exact file id creation method
all the time and can change things quickly and in only one location if
you need to do so.

Please see the :class:`FileStoreNameChooser` default implementation
below for details.

"""
import grok
import os
import tempfile
from hurry.file import HurryFile
from hurry.file.interfaces import IFileRetrieval
from zope.component import queryUtility
from zope.interface import Interface
from waeup.sirp.interfaces import (
    IFileStoreNameChooser, IExtFileStore, IFileStoreHandler,)

class FileStoreNameChooser(grok.Adapter):
    """Default file store name chooser.

    File store name choosers pick a file id, a string, for a certain
    context object. They are normally registered as adapters for a
    certain content type and know how to build the file id for this
    special type of context.

    Provides the :class:`waeup.sirp.interfaces.IFileStoreNameChooser`
    interface.

    This default file name chooser accepts almost every name as long
    as it is a string or unicode object.
    """
    grok.context(Interface)
    grok.implements(IFileStoreNameChooser)

    def checkName(self, name, attr=None):
        """Check whether a given name (file id) is valid.

        Returns ``True`` if the name is valid, ``False`` otherwise.

        For the default file store name chooser any name is valid as
        long as it is a string.

        The `attr` is not taken into account here.
        """
        if isinstance(name, basestring):
            return True
        return False

    def chooseName(self, name=None, attr=None):
        """Choose a unique valid file id for the object.

        The given name may be taken into account when choosing the
        name (file id).

        chooseName is expected to always choose a valid name (that
        would pass the checkName test) and never raise an error.

        For this default name chooser we return the given name if it
        is valid or ``unknown_file`` otherwise. The `attr` param is
        not taken into account here.
        """
        # `name` is optional because ExtFileStore.getFileByContext
        # calls chooseName() without arguments.
        if self.checkName(name):
            return name
        return u'unknown_file'

class ExtFileStore(object):
    """External file store.

    External file stores are meant to store files 'externally' of the
    ZODB, i.e. in the filesystem.

    The most important attribute of the external file store is the
    `root` path which gives the path to the location where files will
    be stored.

    By default `root` is a ``'media/'`` directory in the root of the
    datacenter root of a site.

    The `root` attribute is 'read-only' because you normally don't
    want to change this path -- it is dynamic. That means, if you call
    the file store from 'within' a site, the root path will be located
    inside this site (a :class:`waeup.sirp.University` instance). If
    you call it from 'outside' a site some temporary dir (always the
    same during lifetime of the file store instance) will be used. The
    term 'temporary' tells what you can expect from this path
    persistence-wise.

    If you insist, you can pass a root path on initialization to the
    constructor but when calling from within a site afterwards, the
    site will override your setting for security reasons. This way
    you can safely use one file store for different sites in a Zope
    instance simultaneously and files from one site won't show up in
    another.

    An ExtFileStore instance is available as a global utility
    implementing :class:`waeup.sirp.interfaces.IExtFileStore`.

    To add and retrieve files from the storage, use the appropriate
    methods below.
    """

    grok.implements(IExtFileStore)

    _root = None

    @property
    def root(self):
        """Root dir of this storage.

        The root dir is a readonly value determined dynamically. It
        holds media files for sites or other components.

        If a site is available we return a ``media/`` dir in the
        datacenter storage dir.

        Otherwise we create a temporary dir which will be remembered
        on next call.

        If a site exists and has a datacenter, it always takes
        precedence over temporary dirs, also after a temporary
        directory was created.

        Please note that retrieving `root` is expensive. You might
        want to store a copy once retrieved in order to minimize the
        number of calls to `root`.

        """
        site = grok.getSite()
        if site is not None:
            root = os.path.join(site['datacenter'].storage, 'media')
            return root
        if self._root is None:
            self._root = tempfile.mkdtemp()
        return self._root

    def __init__(self, root=None):
        self._root = root
        return

    def getFile(self, file_id):
        """Get a file stored under file ID `file_id`.

        Returns a file already opened for reading.

        If the file cannot be found ``None`` is returned.

        This method takes into account registered handlers for any
        marker put into the file_id.

        .. seealso:: :class:`DefaultFileStoreHandler`
        """
        marker, filename, base, ext = self.extractMarker(file_id)
        handler = queryUtility(IFileStoreHandler, name=marker,
                               default=DefaultFileStoreHandler())
        path = handler.pathFromFileID(self, self.root, file_id)
        if not os.path.exists(path):
            return None
        fd = open(path, 'rb')
        return fd

    def getFileByContext(self, context):
        """Get a file for given context.

        Returns a file already opened for reading.

        If the file cannot be found ``None`` is returned.

        This method takes into account registered handlers and file
        name choosers for context types.

        This is a convenience method that internally calls
        :meth:`getFile`.

        .. seealso:: :class:`FileStoreNameChooser`,
                     :class:`DefaultFileStoreHandler`.
        """
        file_id = IFileStoreNameChooser(context).chooseName()
        return self.getFile(file_id)

    def createFile(self, filename, f):
        """Store a file.
        """
        file_id = filename
        root = self.root # Calls to self.root are expensive
        marker, filename, base, ext = self.extractMarker(file_id)
        handler = queryUtility(IFileStoreHandler, name=marker,
                               default=DefaultFileStoreHandler())
        # Pass arguments in the order DefaultFileStoreHandler.createFile
        # declares them: (store, root, filename, file_id, f). This way
        # the path is built from the full file id, as in getFile().
        f, path, file_obj = handler.createFile(
            self, root, filename, file_id, f)
        dirname = os.path.dirname(path)
        if not os.path.exists(dirname):
            os.makedirs(dirname, 0755)
        # Close the target file explicitly after writing.
        with open(path, 'wb') as fd:
            fd.write(f.read())
        return file_obj

    def extractMarker(self, file_id):
        """Split filename into marker, filename, basename, and extension.

        A marker is a leading part of a string of form
        ``__MARKERNAME__`` followed by the real filename. This way we
        can put markers into a filename to request special processing.

        Returns a quadruple

          ``(marker, filename, basename, extension)``

        where ``marker`` is the marker in lowercase, ``filename`` is
        the complete trailing real filename, ``basename`` is the
        basename of the filename and ``extension`` the filename
        extension of the trailing filename. See examples below.

        Example:

           >>> extractMarker('__MaRkEr__sample.jpg')
           ('marker', 'sample.jpg', 'sample', '.jpg')

        If no marker is contained, we assume the whole string to be a
        real filename:

           >>> extractMarker('no-marker.txt')
           ('', 'no-marker.txt', 'no-marker', '.txt')

        Filenames without extension give an empty extension string:

           >>> extractMarker('no-marker')
           ('', 'no-marker', 'no-marker', '')

        """
        if not isinstance(file_id, basestring) or not file_id:
            return ('', '', '', '')
        parts = file_id.split('__', 2)
        marker = ''
        if len(parts) == 3 and parts[0] == '':
            marker = parts[1].lower()
            file_id = parts[2]
        basename, ext = os.path.splitext(file_id)
        return (marker, file_id, basename, ext)

grok.global_utility(ExtFileStore, provides=IExtFileStore)

class DefaultStorage(ExtFileStore):
    """Default storage for files.

    Registered globally as utility for
    :class:`hurry.file.interfaces.IFileRetrieval`.
    """
    grok.provides(IFileRetrieval)

grok.global_utility(DefaultStorage, provides=IFileRetrieval)

class DefaultFileStoreHandler(grok.GlobalUtility):
    """A default handler for external file store.

    This handler is the fallback called by external file stores when
    there is no or an unknown marker in the file id.

    Registered globally as utility for
    :class:`waeup.sirp.interfaces.IFileStoreHandler`.
    """
    grok.implements(IFileStoreHandler)

    def pathFromFileID(self, store, root, file_id):
        """Return the root path of the external file store joined
        with the file id.
        """
        return os.path.join(root, file_id)

    def createFile(self, store, root, filename, file_id, f):
        """Return information about what exactly to store and where.

        When a file should be handled by an external file storage, the
        storage looks up any handlers (like this one) and passes them
        runtime information like the storage object, root path,
        filename, file_id, and the raw file object itself.

        The handler can then change the file, raise exceptions or
        whatever and return the result.

        This handler returns the input file as-is, a path returned by
        :meth:`pathFromFileID` and an instance of
        :class:`hurry.file.HurryFile` for further operations.

        Please note: although a handler has enough information to
        store the file itself, it should leave that task to the
        calling file store.
        """
        path = self.pathFromFileID(store, root, file_id)
        return f, path, HurryFile(filename, file_id)
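The marker convention and handler lookup described in the module docstring can be tried outside of Zope with a small standalone sketch. It is written for Python 3 (so `str` replaces `basestring`) and only mirrors the logic of `extractMarker` and `pathFromFileID`. The names `ImgUserHandler`, `HANDLERS`, `path_for` and the `users` subdirectory are hypothetical and not part of `waeup.sirp`; a plain dict stands in for the named-utility registry.

```python
import os


def extract_marker(file_id):
    """Split a file id into (marker, filename, basename, extension)."""
    if not isinstance(file_id, str) or not file_id:
        return ('', '', '', '')
    parts = file_id.split('__', 2)
    marker = ''
    # A marker is only recognized if the id starts with '__'.
    if len(parts) == 3 and parts[0] == '':
        marker = parts[1].lower()
        file_id = parts[2]
    basename, ext = os.path.splitext(file_id)
    return (marker, file_id, basename, ext)


class DefaultHandler(object):
    """Fallback: store the file directly under root, keyed by file id."""

    def pathFromFileID(self, store, root, file_id):
        return os.path.join(root, file_id)


class ImgUserHandler(object):
    """Hypothetical handler for ids like ``__IMG_USER__manfred.jpg``."""

    subdir = 'users'  # assumed directory layout, for illustration only

    def pathFromFileID(self, store, root, file_id):
        marker, filename, base, ext = extract_marker(file_id)
        return os.path.join(root, self.subdir, filename)


# Stand-in for named utility registration: lowercase marker -> handler.
HANDLERS = {'img_user': ImgUserHandler()}


def path_for(root, file_id):
    """Pick a handler by marker and ask it for the storage path."""
    marker = extract_marker(file_id)[0]
    handler = HANDLERS.get(marker, DefaultHandler())
    return handler.pathFromFileID(None, root, file_id)
```

With this sketch, `path_for(root, '__IMG_USER__manfred.jpg')` is routed to the hypothetical handler, while ids without a marker or with an unregistered marker fall back to the default handler, which is the same fallback behaviour the file store documents for `DefaultFileStoreHandler`.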