| Grouped book corpus | Reference layer | Exact | 1.0 across category/token/slot/semantic/syntactic/pragmatic/structural | Stable source of truth for the analyzer. |
| Frozen live unseen bank | Unseen regression set | Frozen | 179 sentences; exact category match 0.989 | Held out for regression, not training. |
| Generated reference corpus | Weak supervision source | Audited | 9200 sentences; keep 8401; review 399; discard 400 | Used to train sentence and token baselines. |
| Sentence-family baseline | Learned comparison | Trained | test accuracy 0.9155; macro F1 0.916 | Measures weak-label learnability at sentence level. |
| Token-level BiLSTM-CRF | Learned comparison | Trained | semantic 0.9214; syntactic 0.9208; pragmatic 0.9469 | Measures the token stack over weak supervision. |
| UD Arabic-PADT | External syntax comparison | External | Public dependency treebank for morphology and syntax comparison. | Benchmark target for transfer and external validation. |
| UD Arabic-PUD | External unseen comparison | External | Held-out-style Arabic dependency corpus for generalization comparison. | Useful for broader unseen-text sanity checks. |
| CAMeL-style morphology tools | Morphology-first baseline | External | Reference point for lemma / POS / morphology, not functional syntax. | Good comparison for the lemmatizer surface only. |
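
The counts reported above can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, the numbers are copied from the rows above, and the match count for the frozen bank is inferred from the rounded score rather than stated in the source.

```python
# Figures copied from the rows above; names are illustrative, not project code.
audit = {"keep": 8401, "review": 399, "discard": 400}
total_generated = 9200

# The three audit buckets should account for every generated sentence.
assert sum(audit.values()) == total_generated

# Share of the generated corpus in each audit bucket.
shares = {label: count / total_generated for label, count in audit.items()}
for label, share in shares.items():
    print(f"{label}: {share:.1%}")

# Frozen unseen bank: which match counts out of 179 round to the reported 0.989?
# This is an inference from the rounded score, not a figure given in the source.
bank_size = 179
reported_score = 0.989
consistent = [
    k for k in range(bank_size + 1)
    if round(k / bank_size, 3) == reported_score
]
print(consistent)
```

Under these assumptions, the audit keeps roughly 91% of the generated sentences, and the only match count consistent with 0.989 on 179 sentences is 177, which would mean two mismatches in the frozen bank.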