Benchmark

A comparison of Tasreef against the grouped book corpus, the frozen unseen bank, and the closest external references.
Book corpus
Exact
Reference coverage is stable.
Live bank
179
Frozen unseen bank with an exact category match of 0.989.
Generated corpus
9200
The main source of weak supervision.
Weak categories
25
Categories that still score below 1.0 on one or more metrics.
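The weak-category definition above (any metric below 1.0) can be illustrated with a minimal sketch. The metric names follow the layers listed for the grouped book corpus; the function name `weak_categories`, the category names, and the scores are hypothetical and are not Tasreef's actual data or API.

```python
# Metric layers as named in the benchmark table; everything else is illustrative.
METRICS = ("category", "token", "slot", "semantic",
           "syntactic", "pragmatic", "structural")


def weak_categories(scores, threshold=1.0):
    """Map each weak category to the metrics on which it falls below threshold.

    A metric missing from a category's scores is treated as 0.0, so an
    unmeasured metric also marks the category as weak (an assumption).
    """
    weak = {}
    for name, per_metric in scores.items():
        low = [m for m in METRICS if per_metric.get(m, 0.0) < threshold]
        if low:
            weak[name] = low
    return weak


# Hypothetical scores: one perfect category, one with a pragmatic gap.
example_scores = {
    "category_a": {m: 1.0 for m in METRICS},
    "category_b": {**{m: 1.0 for m in METRICS}, "pragmatic": 0.97},
}

print(weak_categories(example_scores))
```

Under this reading, the count of 25 would be the number of keys such a function returns over the real per-category scores.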

Morphology-first comparison

This block keeps the external comparison honest: CAMeL-style tools are useful for morphology / lemma / POS, but they do not cover Tasreef’s functional sentence roles or pragmatic placement.
Tasreef
Functional syntax
Book-grounded sentence roles, slot structure, pragmatic layer, structural layer, and a rule-first reference analyzer.
CAMeL-style tools
Morphology / lemma / POS
Strong for token-level morphology and surface analysis, especially closed-class and lemma reliability, but not a functional-syntax system.
UD Arabic-PADT
External syntax training
Useful syntax comparison target and training source for external generalization, but it is dependency-oriented rather than book-grounded role-oriented.
UD Arabic-PUD
External hold-out
Held-out comparison set for unseen syntax generalization and regression checking.

Benchmark comparison

Target | Role | Status | Metric | Note
Grouped book corpus | Reference layer | Exact | 1.0 across category/token/slot/semantic/syntactic/pragmatic/structural | Stable source of truth for the analyzer.
Frozen live unseen bank | Unseen regression set | Frozen | 179 sentences; exact category match 0.989 | Held out for regression, not training.
Generated reference corpus | Weak supervision source | Audited | 9200 sentences; keep 8401; review 399; discard 400 | Used to train sentence and token baselines.
Sentence-family baseline | Learned comparison | Trained | test accuracy 0.9155; macro F1 0.916 | Measures weak-label learnability at sentence level.
Token-level BiLSTM-CRF | Learned comparison | Trained | semantic 0.9214; syntactic 0.9208; pragmatic 0.9469 | Measures the token stack over weak supervision.
UD Arabic-PADT | External syntax comparison | External | Public dependency treebank for morphology and syntax comparison. | Benchmark target for transfer and external validation.
UD Arabic-PUD | External unseen comparison | External | Held-out-style Arabic dependency corpus for generalization comparison. | Useful for broader unseen-text sanity checks.
CAMeL-style morphology tools | Morphology-first baseline | External | Reference point for lemma / POS / morphology, not functional syntax. | Good comparison for the lemmatizer surface only.
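Two rows of the table carry counts that can be sanity-checked with plain arithmetic. The sketch below uses only the figures reported above. The function names are illustrative, and the assumption that 0.989 is rounded to three decimals is a reading of the reported value, not something the source states.

```python
# Figures copied from the table rows above.
GENERATED_TOTAL = 9200
AUDIT = {"keep": 8401, "review": 399, "discard": 400}

BANK_SIZE = 179
REPORTED_MATCH = 0.989


def audit_shares(audit, total):
    """Verify the audit buckets sum to the corpus size and return each share."""
    if sum(audit.values()) != total:
        raise ValueError("audit buckets do not sum to the corpus size")
    return {name: count / total for name, count in audit.items()}


def consistent_match_counts(size, reported, digits=3):
    """List exact-match counts whose ratio rounds to the reported score."""
    return [k for k in range(size + 1) if round(k / size, digits) == reported]


shares = audit_shares(AUDIT, GENERATED_TOTAL)
print({name: round(share, 4) for name, share in shares.items()})
print(consistent_match_counts(BANK_SIZE, REPORTED_MATCH))  # -> [177]
```

The three audit buckets do sum to 9200. If the score is rounded to three decimals, 177 of 179 is the only count consistent with 0.989, which would mean two sentences in the frozen bank received a different category.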

Functional differentiation

Functional sentence roles
Tasreef: Yes
UD-PADT: No
UD-PUD: No
CAMeL-style: Partial
Slot-based structure
Tasreef: Yes
UD-PADT: No
UD-PUD: No
CAMeL-style: No
Pragmatic layer
Tasreef: Yes
UD-PADT: No
UD-PUD: No
CAMeL-style: No
Book-grounded reference layer
Tasreef: Yes
UD-PADT: No
UD-PUD: No
CAMeL-style: No
Morphology / lemma focus
Tasreef: Yes
UD-PADT: Yes
UD-PUD: Yes
CAMeL-style: Yes
Tasreef differs from the external comparisons in that it maintains a book-grounded layer, with slots and an explicit pragmatic reading. The external comparisons here serve benchmarking; they do not redefine the metric.
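The feature comparison above can be encoded as a small lookup table, which makes it easy to ask which capabilities only one system provides. This is a minimal sketch: the values are transcribed from the listing, while `FEATURES`, `SYSTEMS`, and `exclusive_to` are illustrative names, not part of Tasreef.

```python
# Support values transcribed from the feature comparison above.
SYSTEMS = ("Tasreef", "UD-PADT", "UD-PUD", "CAMeL-style")

FEATURES = {
    "Functional sentence roles": ("Yes", "No", "No", "Partial"),
    "Slot-based structure": ("Yes", "No", "No", "No"),
    "Pragmatic layer": ("Yes", "No", "No", "No"),
    "Book-grounded reference layer": ("Yes", "No", "No", "No"),
    "Morphology / lemma focus": ("Yes", "Yes", "Yes", "Yes"),
}


def exclusive_to(system, features=FEATURES, systems=SYSTEMS):
    """Return features where `system` is 'Yes' and all other systems are 'No'."""
    idx = systems.index(system)
    result = []
    for name, row in features.items():
        others = [value for i, value in enumerate(row) if i != idx]
        if row[idx] == "Yes" and all(value == "No" for value in others):
            result.append(name)
    return result


print(exclusive_to("Tasreef"))
```

A "Partial" entry counts as some support here, so functional sentence roles are not reported as exclusive to Tasreef under this strict reading.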

Measurement summary

Sentence accuracy
0.758
Grouped book corpus.
Token accuracy
0.121
Token layer.
Slot accuracy
0.838
Structural placement.
Pragmatic accuracy
0.174
Pragmatic layer.