add training code

Objective: Expose the model to Pd syntax and structure.
Method: I continued pretraining the model using QLoRA on a large dataset of Pd patches.
Dataset: <a href="https://huggingface.co/datasets/ParZiVal04/Pd-patches-14k-dataset" rel="nofollow">ParZiVal04/Pd-patches-14k-dataset</a>, which contains approximately 14,000 Pd patches.
Training Duration: ~20 hours on a Tesla T4 GPU.
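
For readers unfamiliar with QLoRA, the following is a minimal sketch of how such a setup might be built, assuming the Hugging Face `transformers`, `peft`, and `bitsandbytes` libraries. The function name `build_qlora_model`, the `LORA_SETTINGS` values, and the choice of `target_modules` are illustrative assumptions, not the settings used for this run; the base model is left as a parameter because it is not named here.

```python
# Hypothetical hyperparameters; the values actually used for this run are not stated.
LORA_SETTINGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
}


def build_qlora_model(base_model_name, target_modules):
    """Load a causal LM in 4-bit and attach trainable LoRA adapters."""
    # Heavy dependencies are imported lazily so this file can be imported
    # on a machine without the GPU stack installed.
    import torch
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA).
    # float16 compute is chosen because the T4 has no native bfloat16 support.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.float16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        base_model_name,
        quantization_config=bnb_config,
        device_map="auto",
    )
    # Prepares the quantized model for training (e.g. casts norm layers,
    # enables input gradients for gradient checkpointing).
    model = prepare_model_for_kbit_training(model)

    # Only the small LoRA adapter matrices are trained; the base stays frozen.
    lora_config = LoraConfig(target_modules=target_modules, **LORA_SETTINGS)
    return get_peft_model(model, lora_config)
```

The returned model can then be passed to a standard causal language modeling training loop over the raw patch text. Which module names to pass as `target_modules` depends on the base model's architecture.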
diff --git a/alternate_patchfile_representations b/alternate_patchfile_representations