Learned baseline

This page gives the downloadable summary of the sentence-family baseline and the token-level BiLSTM-CRF, both built on weak supervision from the current analyzer.
Sentence baseline: 0.9155 (test accuracy · macro F1 0.916)
Token baseline: 0.9214 (semantic / syntactic / pragmatic)
Token joint: 0.7738 (sequence-level exactness)
Train split: 6514/814/814 (train / dev / test)

Sentence-family baseline

A classifier trained on the generated weak labels. It serves as the sentence-level learned comparison, not as the final analyzer.
Field | Value | Note
Model | /Users/al-hmouz/Documents/Arabic_Tasreef/functional_syntax/models/generated_reference_sentence_classifier.pt | saved checkpoint
Train rows | 6719 | split summary
Dev rows | 840 | split summary
Test rows | 840 | split summary
Train accuracy | 0.9999 | overfit check
Dev accuracy | 0.9226 | selection metric
Test accuracy | 0.9155 | published baseline
Test macro F1 | 0.916 | balanced view
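
The summary reports both accuracy and macro F1, the latter as the "balanced view". The sketch below shows how the two differ on a toy example. It is a minimal illustration, not the evaluation code behind the published numbers, and the sentence-family label names are hypothetical.

```python
def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical sentence-family labels, for illustration only.
gold = ["verbal", "verbal", "verbal", "nominal", "conditional"]
pred = ["verbal", "verbal", "nominal", "nominal", "conditional"]

print(round(accuracy(gold, pred), 4))  # 0.8
print(round(macro_f1(gold, pred), 4))  # 0.8222
```

When the two metrics sit close together, as 0.9155 and 0.916 do here, no single class is dominating the accuracy figure.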

Training history

Epoch | Train loss | Train acc | Dev loss | Dev acc | Dev macro F1
1 | 0.9376 | 0.7254 | 0.4107 | 0.8524 | 0.8564
2 | 0.2192 | 0.8954 | 0.3516 | 0.8655 | 0.863
3 | 0.0876 | 0.9495 | 0.3035 | 0.8964 | 0.9066
4 | 0.0337 | 0.9772 | 0.2801 | 0.9226 | 0.9209
5 | 0.0145 | 0.9905 | 0.2878 | 0.9262 | 0.9267
6 | 0.0079 | 0.9957 | 0.3064 | 0.925 | 0.9238
7 | 0.0066 | 0.997 | 0.3034 | 0.9238 | 0.9232
8 | 0.0029 | 0.9993 | 0.3017 | 0.9226 | 0.924
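
The page does not state which rule picked the saved checkpoint. The sketch below applies three common selection rules to the logged history so they can be compared. The `Epoch` tuple and variable names are illustrative assumptions, and only the numbers come from the table.

```python
from collections import namedtuple

Epoch = namedtuple(
    "Epoch", "epoch train_loss train_acc dev_loss dev_acc dev_macro_f1"
)

# Values copied from the training history table.
history = [
    Epoch(1, 0.9376, 0.7254, 0.4107, 0.8524, 0.8564),
    Epoch(2, 0.2192, 0.8954, 0.3516, 0.8655, 0.8630),
    Epoch(3, 0.0876, 0.9495, 0.3035, 0.8964, 0.9066),
    Epoch(4, 0.0337, 0.9772, 0.2801, 0.9226, 0.9209),
    Epoch(5, 0.0145, 0.9905, 0.2878, 0.9262, 0.9267),
    Epoch(6, 0.0079, 0.9957, 0.3064, 0.9250, 0.9238),
    Epoch(7, 0.0066, 0.9970, 0.3034, 0.9238, 0.9232),
    Epoch(8, 0.0029, 0.9993, 0.3017, 0.9226, 0.9240),
]

# Three candidate selection rules; which one was actually used is not stated.
by_dev_loss = min(history, key=lambda e: e.dev_loss)
by_dev_acc = max(history, key=lambda e: e.dev_acc)
by_dev_f1 = max(history, key=lambda e: e.dev_macro_f1)

print(by_dev_loss.epoch, by_dev_acc.epoch, by_dev_f1.epoch)  # 4 5 5
```

The reported dev accuracy of 0.9226 matches epochs 4 and 8 in this history. Epoch 4 is also the epoch with the lowest dev loss, and epoch 8 is the final epoch.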

Token BiLSTM-CRF baseline

Token-level weak-supervision model. Structural labels remain rule-derived; the learned layer covers semantic, syntactic, and pragmatic tags.
Field | Value | Note
Model | /Users/al-hmouz/Documents/Arabic_Tasreef/functional_syntax/models/generated_reference_token_bilstm_crf.pt | saved checkpoint
Train sentences | 6719 | split summary
Dev sentences | 840 | split summary
Test sentences | 840 | split summary
Train tokens | 26590 | token corpus
Dev tokens | 3352 | token corpus
Test tokens | 3332 | token corpus
Vocab size | 3328 | token vocabulary
Test semantic accuracy | 0.9214 | published baseline
Test syntactic accuracy | 0.9208 | published baseline
Test pragmatic accuracy | 0.9469 | published baseline
Test joint accuracy | 0.7738 | sequence-level exactness
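
The gap between the per-layer accuracies (about 0.92 to 0.95) and the joint accuracy (0.7738) comes from how strict the joint metric is. The sketch below assumes that "sequence-level exactness" means a sentence counts only when every token is correct on all three learned layers. That reading, the function names, and the tag names are assumptions for illustration.

```python
LAYERS = ("semantic", "syntactic", "pragmatic")


def layer_accuracy(gold_sents, pred_sents, layer_index):
    """Token-level accuracy for one tag layer, pooled over all sentences."""
    correct = total = 0
    for gold, pred in zip(gold_sents, pred_sents):
        for g_tok, p_tok in zip(gold, pred):
            correct += g_tok[layer_index] == p_tok[layer_index]
            total += 1
    return correct / total


def joint_accuracy(gold_sents, pred_sents):
    """Share of sentences whose tokens all match on every layer (assumed definition)."""
    exact = sum(gold == pred for gold, pred in zip(gold_sents, pred_sents))
    return exact / len(gold_sents)


# Each token is a (semantic, syntactic, pragmatic) triple; tag names are hypothetical.
gold_sents = [
    [("AGENT", "SUBJ", "TOPIC"), ("EVENT", "PRED", "FOCUS")],
    [("EVENT", "PRED", "FOCUS"), ("PATIENT", "OBJ", "FOCUS"), ("TIME", "ADJ", "BACKGROUND")],
]
pred_sents = [
    [("AGENT", "SUBJ", "TOPIC"), ("EVENT", "PRED", "FOCUS")],
    [("EVENT", "PRED", "FOCUS"), ("PATIENT", "OBJ", "TOPIC"), ("TIME", "ADJ", "BACKGROUND")],
]

for i, name in enumerate(LAYERS):
    print(name, layer_accuracy(gold_sents, pred_sents, i))
print("joint", joint_accuracy(gold_sents, pred_sents))
```

In this toy case a single wrong pragmatic tag leaves semantic and syntactic accuracy at 1.0 and pragmatic accuracy at 0.8, but halves the joint score, because one error anywhere fails the whole sentence.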

Best-state history

Epoch | Train loss | Dev loss | Dev sem | Dev syn | Dev prag | Dev joint
1 | 2.759 | 1.3413 | 0.8699 | 0.8699 | 0.9045 | 0.575
2 | 1.1177 | 1.0195 | 0.8965 | 0.8974 | 0.9266 | 0.6738
3 | 0.7277 | 0.9254 | 0.9081 | 0.9078 | 0.9382 | 0.7131
4 | 0.4847 | 0.9633 | 0.9126 | 0.9132 | 0.9382 | 0.7238
5 | 0.3237 | 0.9584 | 0.9147 | 0.915 | 0.9394 | 0.7429
6 | 0.2203 | 1.0079 | 0.9192 | 0.9186 | 0.9436 | 0.7512
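
One plausible reading of "best-state history" is a log that keeps only the epochs at which a tracked dev metric reached a new best, which would fit the steadily rising Dev joint column. The sketch below illustrates that bookkeeping. The choice of `dev_joint` as the tracked metric, the function name, and the toy values are all assumptions, not details taken from the training code.

```python
def best_state_history(rows, key="dev_joint"):
    """Keep only the rows where `key` strictly improves on the best seen so far."""
    best = float("-inf")
    kept = []
    for row in rows:
        if row[key] > best:
            best = row[key]
            kept.append(row)  # a real training loop would also save the checkpoint here
    return kept


# Hypothetical run: epoch 3 does not improve, so it is not logged.
run = [
    {"epoch": 1, "dev_joint": 0.50},
    {"epoch": 2, "dev_joint": 0.62},
    {"epoch": 3, "dev_joint": 0.60},
    {"epoch": 4, "dev_joint": 0.66},
]

kept = best_state_history(run)
print([row["epoch"] for row in kept])  # [1, 2, 4]
```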

Positioning

The learned baselines are supporting evidence, not the product core. The reference layer stays book-grounded and rule-first. The learned models show that the weak labels are learnable at sentence and token level.