source: main/waeup.sirp/branches/unique-index/README.txt @ 7044

Last change on this file since 7044 was 6211, checked in by uli, 14 years ago
Provide a unique field index for catalogs.
File size: 4.7 KB

Field Indexes
=============

Field indexes index orderable values. Note that they don't check for
orderability. That is, all of the values added to the index must be
orderable together. It is up to applications to provide only mutually
orderable values.
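
To illustrate what "mutually orderable" means in practice, here is a
small sketch in plain Python 3. The helper name ``mutually_orderable`` is
hypothetical and not part of ``zope.index``; it simply tries to sort the
candidate values and reports whether the comparisons succeeded.

```python
def mutually_orderable(values):
    """Return True if sorting the values succeeds.

    Hypothetical helper for illustration only; not part of zope.index.
    """
    try:
        sorted(values)
    except TypeError:
        # Python 3 raises TypeError when comparing, e.g., int and str.
        return False
    return True


ints_only = mutually_orderable([6, 26, 94])
mixed = mutually_orderable([6, "twenty-six", 94])
```

An application could run a check like this before indexing, since the
index itself will not reject mixed values.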

>>> from zope.index.field import FieldIndex

>>> index = FieldIndex()
>>> index.index_doc(0, 6)
>>> index.index_doc(1, 26)
>>> index.index_doc(2, 94)
>>> index.index_doc(3, 68)
>>> index.index_doc(4, 30)
>>> index.index_doc(5, 68)
>>> index.index_doc(6, 82)
>>> index.index_doc(7, 30)
>>> index.index_doc(8, 43)
>>> index.index_doc(9, 15)

Field indexes are searched with ``apply``. The argument is a tuple
with a minimum and a maximum value; both bounds are inclusive:

>>> index.apply((30, 70))
IFSet([3, 4, 5, 7, 8])

A common mistake is to pass a single value. If anything other than a
two-tuple is passed, a type error is raised:

>>> index.apply('hi')
Traceback (most recent call last):
...
TypeError: ('two-length tuple expected', 'hi')

Open-ended ranges can be specified by passing None as an end point:

>>> index.apply((30, None))
IFSet([2, 3, 4, 5, 6, 7, 8])

>>> index.apply((None, 70))
IFSet([0, 1, 3, 4, 5, 7, 8, 9])

>>> index.apply((None, None))
IFSet([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

To do an exact value search, supply equal minimum and maximum values:

>>> index.apply((30, 30))
IFSet([4, 7])

>>> index.apply((70, 70))
IFSet([])
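
The range semantics shown above (inclusive bounds, None for an open end)
can be sketched in plain Python. This is a simplified stand-in built on a
dict and the ``bisect`` module, not the actual ``FieldIndex``
implementation; the names ``apply_range`` and ``forward`` are assumptions
made for this example.

```python
from bisect import bisect_left, bisect_right


def apply_range(forward, query):
    """Return sorted doc ids whose value lies in the inclusive range.

    ``forward`` maps value -> set of doc ids (a stand-in for a forward
    index). ``query`` is a (minimum, maximum) tuple; None leaves that
    end of the range open.
    """
    if not isinstance(query, tuple) or len(query) != 2:
        raise TypeError("two-length tuple expected", query)
    minimum, maximum = query
    values = sorted(forward)
    lo = 0 if minimum is None else bisect_left(values, minimum)
    hi = len(values) if maximum is None else bisect_right(values, maximum)
    result = set()
    for value in values[lo:hi]:
        result.update(forward[value])
    return sorted(result)


# Same data as the doctest above: doc id -> value.
docs = {0: 6, 1: 26, 2: 94, 3: 68, 4: 30, 5: 68, 6: 82, 7: 30, 8: 43, 9: 15}
forward = {}
for doc_id, value in docs.items():
    forward.setdefault(value, set()).add(doc_id)

in_range = apply_range(forward, (30, 70))     # [3, 4, 5, 7, 8]
open_ended = apply_range(forward, (30, None))  # [2, 3, 4, 5, 6, 7, 8]
exact = apply_range(forward, (30, 30))         # [4, 7]
```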

Field indexes support basic statistics:

>>> index.documentCount()
10
>>> index.wordCount()
8
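
The two counts differ because the values 30 and 68 are each shared by two
documents. Assuming ``wordCount`` reports the number of distinct indexed
values (which the numbers above are consistent with), the same figures can
be reproduced from a plain mapping of doc id to value:

```python
# Same data as the doctest above: doc id -> value.
docs = {0: 6, 1: 26, 2: 94, 3: 68, 4: 30, 5: 68, 6: 82, 7: 30, 8: 43, 9: 15}

document_count = len(docs)             # one entry per indexed document
word_count = len(set(docs.values()))   # distinct values only
```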

Documents can be reindexed:

>>> index.apply((15, 15))
IFSet([9])
>>> index.index_doc(9, 14)

>>> index.apply((15, 15))
IFSet([])
>>> index.apply((14, 14))
IFSet([9])

Documents can be unindexed:

>>> index.unindex_doc(7)
>>> index.documentCount()
9
>>> index.wordCount()
8
>>> index.unindex_doc(8)
>>> index.documentCount()
8
>>> index.wordCount()
7

>>> index.apply((30, 70))
IFSet([3, 4, 5])

Unindexing a document id that isn't present is ignored:

>>> index.unindex_doc(8)
>>> index.unindex_doc(80)
>>> index.documentCount()
8
>>> index.wordCount()
7

We can also clear the index entirely:

>>> index.clear()
>>> index.documentCount()
0
>>> index.wordCount()
0

>>> index.apply((30, 70))
IFSet([])

Sorting
-------

Field indexes also implement the ``IIndexSort`` interface, which
provides a method for sorting document ids by their indexed
values.

>>> index.index_doc(1, 9)
>>> index.index_doc(2, 8)
>>> index.index_doc(3, 7)
>>> index.index_doc(4, 6)
>>> index.index_doc(5, 5)
>>> index.index_doc(6, 4)
>>> index.index_doc(7, 3)
>>> index.index_doc(8, 2)
>>> index.index_doc(9, 1)

>>> list(index.sort([4, 2, 9, 7, 3, 1, 5]))
[9, 7, 5, 4, 3, 2, 1]

We can also specify the ``reverse`` argument to reverse results:

>>> list(index.sort([4, 2, 9, 7, 3, 1, 5], reverse=True))
[1, 2, 3, 4, 5, 7, 9]

And as per IIndexSort, we can limit results by specifying the ``limit``
argument:

>>> list(index.sort([4, 2, 9, 7, 3, 1, 5], limit=3))
[9, 7, 5]

If we pass an id that is not indexed by this index, it won't be included
in the result.

>>> list(index.sort([2, 10]))
[2]

>>> index.clear()
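
The sorting behaviour above can be approximated in a few lines of plain
Python. This is only a sketch under the assumption that the index keeps a
mapping from doc id to value; the names ``sort_ids`` and ``rev_index`` are
made up for this example, and the real index may choose a more efficient
strategy than sorting every candidate.

```python
from itertools import islice


def sort_ids(rev_index, doc_ids, reverse=False, limit=None):
    """Return an iterator of doc ids ordered by their indexed value.

    ``rev_index`` maps doc id -> value. Ids that are not indexed are
    skipped. ``limit`` of None means no limit.
    """
    known = [doc_id for doc_id in doc_ids if doc_id in rev_index]
    ordered = sorted(known, key=rev_index.__getitem__, reverse=reverse)
    return islice(ordered, limit)


# Same data as the doctest above: doc id -> value.
rev_index = {1: 9, 2: 8, 3: 7, 4: 6, 5: 5, 6: 4, 7: 3, 8: 2, 9: 1}
ids = [4, 2, 9, 7, 3, 1, 5]

ascending = list(sort_ids(rev_index, ids))
descending = list(sort_ids(rev_index, ids, reverse=True))
first_three = list(sort_ids(rev_index, ids, limit=3))
unknown_skipped = list(sort_ids(rev_index, [2, 10]))
```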

Bugfix testing
--------------

It has happened at least once that a value dropped out of the forward
index while the index still contained the document, which broke
unindexing.

>>> index.index_doc(0, 6)
>>> index.index_doc(1, 26)
>>> index.index_doc(2, 94)
>>> index.index_doc(3, 68)
>>> index.index_doc(4, 30)
>>> index.index_doc(5, 68)
>>> index.index_doc(6, 82)
>>> index.index_doc(7, 30)
>>> index.index_doc(8, 43)
>>> index.index_doc(9, 15)

>>> index.apply((None, None))
IFSet([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Here is the damage:

>>> del index._fwd_index[68]

Unindex should succeed:

>>> index.unindex_doc(5)
>>> index.unindex_doc(3)

>>> index.apply((None, None))
IFSet([0, 1, 2, 4, 6, 7, 8, 9])
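
One way to make unindexing tolerant of this kind of damage is to treat a
missing forward entry as "nothing left to clean up". The sketch below uses
two plain dicts as stand-ins for the forward and reverse indexes; it is an
assumption about how such a guard could look, not the code used by the
index itself.

```python
def unindex_doc(fwd_index, rev_index, doc_id):
    """Remove doc_id from both mappings, tolerating a damaged forward index.

    ``fwd_index`` maps value -> set of doc ids; ``rev_index`` maps
    doc id -> value. Both are stand-ins for illustration.
    """
    if doc_id not in rev_index:
        return  # unknown ids are ignored
    value = rev_index.pop(doc_id)
    doc_ids = fwd_index.get(value)
    if doc_ids is None:
        return  # forward entry already gone, nothing more to clean up
    doc_ids.discard(doc_id)
    if not doc_ids:
        del fwd_index[value]


# Same data as the doctest above: doc id -> value.
rev_index = {0: 6, 1: 26, 2: 94, 3: 68, 4: 30, 5: 68, 6: 82, 7: 30, 8: 43, 9: 15}
fwd_index = {}
for doc_id, value in rev_index.items():
    fwd_index.setdefault(value, set()).add(doc_id)

del fwd_index[68]                      # simulate the damage
unindex_doc(fwd_index, rev_index, 5)   # must not raise
unindex_doc(fwd_index, rev_index, 3)   # must not raise
remaining = sorted(rev_index)
```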

Optimizations
-------------

There is an optimization which makes sure that nothing is changed in the
internal data structures if the value of the document was not changed.

To test this optimization we patch the index instance to make sure
``unindex_doc`` is not called.

>>> def unindex_doc(doc_id):
...     raise KeyError
>>> index.unindex_doc = unindex_doc

Now we get a KeyError if we try to change the value.

>>> index.index_doc(9, 14)
Traceback (most recent call last):
...
KeyError

Leaving the value unchanged doesn't call ``unindex_doc``.

>>> index.index_doc(9, 15)
>>> index.apply((15, 15))
IFSet([9])
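
The idea behind this optimization can be sketched with a tiny stand-in
class. ``TinyIndex`` and its ``unindex_calls`` counter are hypothetical and
exist only for this example; the point is the early return when the stored
value equals the new one.

```python
class TinyIndex:
    """Hypothetical stand-in for a field index; not zope.index code."""

    def __init__(self):
        self.rev_index = {}     # doc id -> value
        self.unindex_calls = 0  # counts how often unindex_doc ran

    def unindex_doc(self, doc_id):
        self.unindex_calls += 1
        self.rev_index.pop(doc_id, None)

    def index_doc(self, doc_id, value):
        if doc_id in self.rev_index:
            if self.rev_index[doc_id] == value:
                return  # value unchanged: leave data structures untouched
            self.unindex_doc(doc_id)
        self.rev_index[doc_id] = value


index = TinyIndex()
index.index_doc(9, 15)
index.index_doc(9, 15)   # unchanged value, unindex_doc is not called
calls_after_same = index.unindex_calls
index.index_doc(9, 14)   # changed value, the old entry is removed first
calls_after_change = index.unindex_calls
```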
