Serialized Statutes
Serial Pattern
Bases: BasePattern
A Rule can be extracted from a SerialPattern. The word serial
is used because the documents representing rules are numbered consecutively.
Each serial pattern pairs a Statute Category (e.g. RA, CA)
with a Serial Identifier.
Since a SerialPattern inherits from BasePattern, it also includes
the fields declared in that model, matches and excludes, bringing the
total number of fields to 5:
| Field | Description | Example |
|---|---|---|
| cat | Statute Category | StatuteSerialCategory.RepublicAct |
| regex_bases | How do we pattern the category name? | ["r.a. no.", "Rep. Act. No."] |
| regex_serials | Which digits are allowed? | ["386", "11114"] |
| matches | Usable in parametrized tests to determine whether the declared pattern matches the samples | ["Republic Act No. 7160", "R.A. 386 and 7160"] |
| excludes | Usable in parametrized tests to determine that the full pattern will not match | ["Republic Act No. 7160:", "RA 9337-"] |
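The five fields can be illustrated with a minimal sketch. The class below is a hypothetical stand-in written with a plain dataclass, not the library's SerialPattern; the field values, the way the bases and serials are joined, and the rule that a serial may not be followed by a digit, colon, or hyphen are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SketchSerialPattern:
    """Hypothetical stand-in for SerialPattern; not the library's class."""

    cat: str
    regex_bases: list[str]
    regex_serials: list[str]
    matches: list[str] = field(default_factory=list)
    excludes: list[str] = field(default_factory=list)

    @property
    def regex(self) -> str:
        bases = "|".join(self.regex_bases)
        serials = "|".join(self.regex_serials)
        # Assumed rule: a serial must not be followed by a digit, colon, or hyphen.
        return rf"(?:{bases})\s*(?:{serials})(?![\d:-])"


# Illustrative values only; they are not taken from the library.
pattern = SketchSerialPattern(
    cat="ra",
    regex_bases=[r"Republic\s+Act\s+No\.", r"R\.A\.\s+No\."],
    regex_serials=["386", "7160"],
    matches=["Republic Act No. 7160", "R.A. No. 386"],
    excludes=["Republic Act No. 7160:", "R.A. No. 386-"],
)

compiled = re.compile(pattern.regex, re.IGNORECASE)

# Every sample in matches should match in full; no sample in excludes should match.
all_match = all(compiled.fullmatch(s) for s in pattern.matches)
none_match = not any(compiled.search(s) for s in pattern.excludes)
```

In a test suite, the matches and excludes lists could feed a parametrized test so that each sample is reported as its own case.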
Source code in statute_patterns/models.py
Attributes
lines: Iterator[str]
property
Each regex string produced matches the serial rule. Note that the line break
needs to be retained so that the result is organized when printing @regex.
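One way to picture why retained line breaks help is the sketch below. It assumes, without confirmation from the source, that each line is one regex alternative ending in a line break and that the combined pattern is compiled in verbose mode, where unescaped whitespace is ignored. The function sketch_lines and its arguments are hypothetical.

```python
import re
from typing import Iterator


def sketch_lines(bases: list[str], serials: str) -> Iterator[str]:
    """Yield one regex alternative per base, each ending in a line break."""
    for base in bases:
        yield rf"(?:{base}\s*(?:{serials}))" + "\n"


lines = list(sketch_lines([r"R\.A\.\s+No\.", r"Rep\.\s+Act\s+No\."], r"\d+"))
regex = "|".join(lines)
print(regex)  # each alternative is printed on its own line

# re.VERBOSE ignores the retained line breaks when the pattern is compiled.
compiled = re.compile(regex, re.VERBOSE | re.IGNORECASE)
found = compiled.search("See Rep. Act No. 386 for details.")
```

Because verbose mode drops unescaped whitespace, any space inside a base has to be written as \s or escaped, as in the sample bases above.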
Serial Pattern Collection
Bases: BaseCollection
Each category-based, serial-numbered legal title will have a regex string. For example, Republic Act is a category, and 386 is a serial number in this category, representing the Philippine Civil Code.
Source code in statute_patterns/models.py
Functions
extract_rules(text)
Each m, a Python Match object, represents a
serial pattern category with a possibly ambiguous identifier.
Running m.group(0) should therefore yield the entire text of the
match, which consists of (a) the definitive category
and (b) the ambiguous identifier.
The identifier is ambiguous because it may be a compound one, e.g. 'Presidential Decree No. 1 and 2'. In this case, two matches should be produced, not just one.
This function splits the identifier on commas (,) and the
word and to get the individual component identifiers.
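The splitting step can be sketched as follows. This is an illustration under stated assumptions, not the library's extract_rules: the names CATEGORY, IDENTIFIER, SPLITTER, and sketch_extract_rules are hypothetical, the category is hard-coded as "pd", and each result is a plain (category, identifier) tuple rather than a Rule.

```python
import re
from typing import Iterator

# Hypothetical pattern: a definitive category, then a possibly compound identifier.
CATEGORY = r"Presidential\s+Decree\s+No\."
IDENTIFIER = r"\d+(?:(?:\s*,\s*(?:and\s+)?|\s+and\s+)\d+)*"
PATTERN = re.compile(rf"(?P<cat>{CATEGORY})\s*(?P<ids>{IDENTIFIER})", re.IGNORECASE)

# Separators between component identifiers: a comma, the word "and", or both.
SPLITTER = re.compile(r"\s*,\s*(?:and\s+)?|\s+and\s+")


def sketch_extract_rules(text: str) -> Iterator[tuple[str, str]]:
    """Yield one (category, identifier) pair per component identifier."""
    for m in PATTERN.finditer(text):
        # m.group(0) is the whole match: the category plus the compound identifier.
        for part in SPLITTER.split(m.group("ids")):
            if part:
                yield ("pd", part)


rules = list(sketch_extract_rules("See Presidential Decree No. 1 and 2, as amended."))
```

Here the single match 'Presidential Decree No. 1 and 2' produces two pairs, one for identifier 1 and one for identifier 2.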