Mission Database: Difference between revisions
Nbohr1more (talk | contribs) |
Revision as of 22:42, 16 July 2024
All released missions are stored in the FM database, which defines what is available in the in-game mission downloader.
Requirements
In order for a mission to be included in the database, it must satisfy the following conditions:
- It has gone through beta-testing by several forum members who did not take part in its creation.
- As a rare exception, a mission can be rejected from the database if the majority of beta-testers come to the conclusion that it is unplayable or that its quality is far too low.
- It does not contain any questionable material or potential intellectual property infringements, including:
- using names from the original Thief games
- using assets from other games
- having content which is so bad that it is forbidden even on our forums (link to forum rules)
A few missions are installed along with the game, such as Saint Lucia or the Training Mission. These official missions are considered to be part of the game, and are not included in the database.
SVN
Since 2021, the FM database has been stored in an SVN repository (#5551). Only TDM team members have access to this SVN.
Here is the SVN address:
But please don't rush to checkout the whole repo yet!
Checkout everything
Yes, you can simply copy the link above into SVN Checkout and get a single working copy with all the FMs. But keep in mind that you will have to download ~10 GB of data and the working copy will take ~20 GB of space. In most cases you don't need everything in order to work with the database.
The typical reasons to checkout the whole repo are:
- You want to run automated search or tests over all released FMs.
- You are a regular committer to the FM database who adds a lot of new missions and updates, so the investment is worth it.
Checkout as needed
Instead of checking out the whole repo, you can check out only the few FMs you are going to modify. This is the recommended approach.
With this approach, you need the Repo-browser feature of TortoiseSVN. It lets you look through all the directories and files on the remote SVN without checking them out first.
The directory structure of the SVN is shown in the picture. Most importantly, all information about the FM with internal name "qwerty" is stored in the fms/qwerty subdirectory of the repo. You can check out only this directory in order to work with the FM, just use this checkout address:
Or find the FM directory in Repo-browser, right-click and select Checkout.
Detailed instructions in the rest of the article assume this approach.
Add New Mission
Before adding a new FM, agree on an internal name with the FM author. It must differ from the names of all existing FMs, consist only of lowercase letters and digits, and be rather short (aim for 10-20 letters). Among the many words in the mission title, prefer rare words and proper nouns over common words when composing the internal name.
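The naming rules above can be checked mechanically before you ask for a directory to be created. The following is a minimal sketch and not part of the repository tooling: the function name check_internal_name is hypothetical, and the hard limit of 20 characters is an assumption taken from the "aim for 10-20 letters" guideline.

```python
import re

# Assumed limit: the text only says "aim for 10-20 letters", so 20 is used
# here as a soft maximum for illustration.
MAX_LENGTH = 20
NAME_PATTERN = re.compile(r"[a-z0-9]+")


def check_internal_name(name, existing_names):
    """Return a list of problems with a proposed internal name (empty if none)."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append("must consist only of lowercase letters and digits")
    if len(name) > MAX_LENGTH:
        problems.append(f"longer than {MAX_LENGTH} characters")
    if name in existing_names:
        problems.append("already used by another FM")
    return problems
```

The set of existing names would come from the directory listing of fms in Repo-browser; here it is simply passed in by the caller.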
Open the SVN in Repo-browser and find the fms directory, which contains all the FMs. Right-click on it and choose Add folder in the context menu. Then enter the internal name of the FM as the name of the new directory and proceed with the commit. Now that the directory has been created, you can check it out: find the directory in Repo-browser, right-click it and select Checkout.
The second step is to upload pk4 files.
Create two sub-directories under the mission checkout directory: one named screenshots and another named mission-name.pk4_dir (replace mission-name with your preferred name).
Unpack all the pk4 files of the FM pack into the mission-name.pk4_dir sub-folder of the working copy directory (i.e. the checkout directory).
Open the directory in Windows Explorer, right-click the parent folder and select SVN commit.
Select all files with Shift, right-click and select Add.
Also set checkboxes for these files.
Write the commit message in the text area above: it should start with the internal name of the FM in brackets.
When everything is done, hit OK to do the actual commit.
While the pk4 files are already in the repository, they are not yet visible in the database. A mission is only added when its directory contains an fminfo.xml file, so now you need to add it. You can take this file from another FM (find it in Repo-browser, right-click, Open with, select a text editor), and adjust it for the FM being added. Here is an explanation of some fields:
- internalName defines name of the directory and pk4 file.
- title is the name seen by players in-game.
- author is one or several people who made the mission.
- releaseDate shows when the very first version of the mission was added.
- type is multi if the mission contains several playable .map files (i.e. it is a campaign), and single otherwise.
- size is the size of the main pk4 file in megabytes, as displayed to users.
- version is a natural number used by the in-game downloader to decide whether an update is available or not. It starts at 1.
- description contains text displayed in in-game downloader when player inspects mission details.
- mainPack points to the main pk4 file of the mission. Note that the name of file is fully determined by internal name.
- localisationPack points to _l10n.pk4 file if it exists.
Note that XML cannot directly contain some characters, thus they must be escaped:
- Ampersand (&), quotes (" or ') and angle brackets (< or >): reference
- Line break symbol can be inserted as according to reference
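If the description text is prepared by a script, the escaping can be delegated to the Python standard library. This is a minimal sketch: escape_for_xml is a hypothetical helper name, and it covers only the ampersand, quote and angle-bracket cases from the list above.

```python
from xml.sax.saxutils import escape

# The default call escapes &, < and >; the extra mapping adds both quote types.
QUOTES = {'"': "&quot;", "'": "&apos;"}


def escape_for_xml(text):
    """Escape the five characters that XML cannot contain directly."""
    return escape(text, QUOTES)
```

The ampersand is replaced first, so entities produced for the other characters are not escaped a second time.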
When you have created the fminfo.xml file, double-check that all properties are correct. Then use SVN to add and commit the file, just like you did with the pk4 files.
If you did something wrong (most likely), then you will see an error saying "Commit blocked by pre-commit hook". A long stack trace from a Python script is included, and the meaningful message should be at the end of it. Typically, it is either an XML validation error saying that something in fminfo.xml is wrong, or a message from some custom failed check. You need to fix the errors and try to commit again until you manage to commit successfully.
As the last step, create a subdirectory named screenshots in the working copy directory. Put screenshots in .jpg or .png format into the directory: they will be displayed in the in-game mission downloader. Then add and commit all the screenshot files into the repository, the same way as you did for the pk4 and xml files.
Update Mission
This section covers the case where a mission has already been released and a new version must be uploaded.
First of all, make sure you have an up-to-date working copy of the FM directory.
If you already have a working copy, right-click it in Windows Explorer and select SVN Update.
If you don't have it yet, then open Repo-browser, find the directory named after the FM's internal name, right-click and select Checkout.
If you don't know the internal name, you can learn it like this: install the FM in the game, then look at what is written in the currentfm.txt file.
Suppose the internal name is "qwerty".
If you know the author didn't remove any files from the new version, just extract the contents of the new PK4 to the qwerty.pk4_dir and overwrite the contents.
Then browse the sub-folders for any unversioned new items (they usually have a question mark indicator), right-click them and select SVN Add.
Then update the version number and the file size in the FM XML file.
Make sure no files or folders have spaces or special characters in their names.
Navigate to the parent directory of the qwerty folder, right-click qwerty and choose commit.
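The check for spaces and special characters in names can be automated before committing. The sketch below is an illustration only: the article does not define which characters count as "special", so the allowed set (ASCII letters, digits, underscore, hyphen and dot) is an assumption, and find_unsafe_names is a hypothetical helper name.

```python
import os
import re

# Assumption: "special characters" means anything outside ASCII letters,
# digits, underscore, hyphen and dot. The article does not define the set.
SAFE_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def find_unsafe_names(root):
    """Return relative paths of files and folders whose names look unsafe."""
    unsafe = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip SVN metadata, which is not part of the mission content.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for name in dirnames + filenames:
            if not SAFE_NAME.fullmatch(name):
                unsafe.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(unsafe)
```

Running it on the qwerty.pk4_dir folder and getting an empty list suggests the names are fine under this assumed rule.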
If the mission author removed files or you are unsure, you will need to compare the existing copy to the new version prior to merging.
For Windows users the program WinMerge is a good tool; Linux users can use "diff -r folderA folderB".
Identify additions and removals, overwrite the contents of the qwerty.pk4_dir, then perform SVN Add and SVN Delete actions on all relevant items.
Then update the FM XML and commit as described above.
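As a cross-platform alternative to WinMerge or diff, the additions and removals can be listed with Python's standard filecmp module. This is a minimal sketch; added_and_removed is a hypothetical helper name, and a directory that exists on only one side is reported as a single entry rather than file by file.

```python
import filecmp
import os


def added_and_removed(old_dir, new_dir):
    """Return (added, removed) relative paths between two directory trees."""
    added, removed = [], []

    def walk(cmp, prefix):
        # Names present only in the old tree were removed by the author.
        removed.extend(os.path.join(prefix, n) for n in cmp.left_only)
        # Names present only in the new tree were added.
        added.extend(os.path.join(prefix, n) for n in cmp.right_only)
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(old_dir, new_dir, ignore=[".svn"]), "")
    return sorted(added), sorted(removed)
```

Here old_dir would be the existing qwerty.pk4_dir and new_dir a folder with the unpacked new version; entries in the first list need SVN Add and entries in the second need SVN Delete.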
Storage Concerns
We store all missions in an SVN repository, thus every version of every FM is saved forever. While the total size of all FMs can be 10 GB, the SVN repository can be larger because it stores the full history, especially if large FMs are updated many times. As of 2021, it is not yet clear how bad things will become. Most likely the repo won't grow too large, but it's better to be careful. In order to decide how to make the history smaller, we should first understand how the SVN repository is stored on the server.
Xdelta and Zip Format
SVN history is a series of revisions. For every revision, SVN stores the diff between the previous version and the new one for every modified file. So when we commit an update to a pk4 file, the size of the SVN repository grows by the size of the diff on the pk4 file. In the worst case the diff can be as large as the new version of the pk4 file. Unfortunately, such a worst-case outcome easily happens even if only a few files inside the archive were modified.
A pk4 file is an ordinary zip archive, so it is stored in Zip format. All files are stored sequentially inside the archive file, one after another. Every file inside a zip archive is compressed independently of all the other files, and occupies some subsegment of the archive. If some file was not changed and was not recompressed, then the new archive contains exactly the same bytes for this file as the old archive. In theory, a perfect diff algorithm can detect this and avoid including any data for such "not-changed" files in the diff.
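The sequential, per-file layout can be observed with Python's zipfile module. The sketch below is only an illustration with made-up file names: it builds two small in-memory archives and lists where each entry starts and how many compressed bytes it occupies. The helper names entry_layout and build_archive are hypothetical.

```python
import io
import zipfile


def entry_layout(archive):
    """Return (name, header_offset, compress_size, CRC) for each entry of a zip."""
    with zipfile.ZipFile(archive) as zf:
        return [(i.filename, i.header_offset, i.compress_size, i.CRC)
                for i in zf.infolist()]


def build_archive(files):
    """Build an in-memory zip from a {name: bytes} mapping, in insertion order."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    buf.seek(0)
    return buf


old = entry_layout(build_archive({"a.txt": b"hello" * 100, "b.txt": b"world" * 100}))
new = entry_layout(build_archive({"a.txt": b"hello" * 100, "b.txt": b"world" * 100,
                                  "c.txt": b"new" * 100}))
```

In this example the first two entries of the new archive have the same offsets, compressed sizes and checksums as in the old one, because the added file was appended after them.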
In SVN, the diff between revisions is computed using the xdelta algorithm with a search window limited to 100 KB. Due to the very limited search window, the algorithm cannot reliably detect that files inside the old archive are reused in the new one. Changing the order of files inside the zip archive or removing files larger than 100 KB is enough to completely break the diff algorithm, resulting in a maximum-size diff. That's why even using 7-zip to modify the old archive does not guarantee that your commit will produce a small diff. In fact, a maximum-size diff is almost guaranteed if you remove at least one file larger than 100 KB (the same can also happen when a file is modified).
Pk4diff Optimization
We have a special tool for "optimizing" pk4 file to reduce diff size. This tool inspects the old and the new versions of the archive and finds which files have equal contents. Then it repacks the archive in the following way:
- Take old version of the archive.
- Rename all files which were modified or removed to __trash__/trashN._tbin.
- Append files which were added or modified to the end of the archive.
The resulting pk4 archive is almost exactly the same as the old one, with new data appended at the end. It is almost certain that SVN will produce a diff which only contains the differences (the appended data). The downside is that the "optimized" pk4 file is slightly larger because it still stores the old data as "trash".
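The planning part of the repacking steps above can be sketched in a few lines of Python. This is not the actual pk4diff implementation: it assumes that matching CRC and uncompressed size is a good enough proxy for "equal contents", the trash numbering starting at 0 is hypothetical, and it only computes the plan without rewriting any archive.

```python
import zipfile


def plan_repack(old_pk4, new_pk4):
    """Sketch of the plan: which old entries become trash, which are appended."""
    with zipfile.ZipFile(old_pk4) as old, zipfile.ZipFile(new_pk4) as new:
        old_entries = {i.filename: (i.CRC, i.file_size) for i in old.infolist()}
        new_entries = {i.filename: (i.CRC, i.file_size) for i in new.infolist()}

    # Old entries that are gone or whose contents differ keep their bytes in
    # place but get renamed to trash names.
    to_trash = [n for n, sig in old_entries.items() if new_entries.get(n) != sig]
    # New entries that are missing from the old archive or differ are appended.
    to_append = [n for n, sig in new_entries.items() if old_entries.get(n) != sig]
    renames = {name: f"__trash__/trash{k}._tbin" for k, name in enumerate(to_trash)}
    return renames, to_append
```

Unchanged entries appear in neither result, which is the point of the scheme: their bytes stay where they were in the old archive.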
In order to run the optimizer script, Python 3 must be installed. SVN must also be available on the command line (for TortoiseSVN, make sure to check "command line client tools" during installation). The tool is located in devel/pk4diff/bin in the assets repo and consists of a Python script, the pk4diff executable, and the xdelta3 executable. The easiest way to run it is to copy all three files into the directory with the pk4 file (which must be in an SVN working copy), then execute on the command line:
python pk4diff.py --optimize qwerty.pk4
Here is the sample output:
CMD: svn export hhta.pk4@BASE __tmp_clean__.pk4
A __tmp_clean__.pk4
Export complete.
CMD: pk4diffexe __tmp_clean__.pk4 hhta.pk4
Added size: 6676263
Removed size: 33085296
CMD: xdelta3 -e -f -B 524288 -W 524288 -s __tmp_clean__.pk4 hhta.pk4 __tmp_diff__.bin
Xdelta diff size: 517134007
CMD: pk4diffexe __tmp_clean__.pk4 hhta.pk4 __tmp_optimized__.pk4
Added size: 6676263
Removed size: 33085296
CMD: xdelta3 -e -f -B 524288 -W 524288 -s __tmp_clean__.pk4 __tmp_optimized__.pk4 __tmp_diff__.bin
Xdelta diff size: 6712187
Added portion of dead data: 5.885588%
Replacing pk4 file with optimized file
Each line starting with "CMD:" shows a program being run with its parameters. The script works like this:
- The procedure starts by exporting a clean version of hhta.pk4 using an SVN command.
- Then pk4diffexe is run: it displays how many bytes are added/removed in the update. The full FM package is about 500 MB, so the changes are pretty small in this example.
- xdelta3 is run to estimate the initial size of the diff. In this example it is a maximum-size diff (+500 MB to the repo size).
- pk4diffexe is run again, but now it produces an optimized pk4 file.
- xdelta3 is run again on the optimized pk4 file. The diff size becomes about 6 MB, so the optimization has reduced the diff a lot.
- The optimized pk4 file contains some trash data, and we are told which portion of the optimized pk4 is trash. It's only 6% in this case (33 MB).
- Since the portion of trash is lower than 10%, the pk4 file is replaced with the optimized one. If there is too much trash, the optimized pk4 is simply deleted with a different message.
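The arithmetic behind the last two steps can be written out as follows. This is a sketch of the decision rule as described in the list, not code taken from pk4diff.py: the function names are hypothetical, and defining the trash portion as trash bytes divided by the size of the optimized archive is an assumption based on the wording above.

```python
def trash_fraction(trash_bytes, optimized_bytes):
    """Fraction of the optimized archive that is dead ("trash") data."""
    return trash_bytes / optimized_bytes


def accept_optimized(trash_bytes, optimized_bytes, limit=0.10):
    """Apply the 10% rule: keep the optimized archive only below the limit."""
    return trash_fraction(trash_bytes, optimized_bytes) < limit


def diff_reduction(plain_diff_bytes, optimized_diff_bytes):
    """How many times smaller the diff became after optimization."""
    return plain_diff_bytes / optimized_diff_bytes
```

With the two diff sizes from the sample output, diff_reduction(517134007, 6712187) gives a factor of roughly 77.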
After running the program, the hhta.pk4 file is modified: now it is the optimized version. Also, there is a file hhta.pk4.old next to it: this is the copy of the modified pk4 before optimization, in case you decide to restore it. All that is left is to delete the .old file and commit the modified pk4 to SVN.
Note that xdelta3 provides only a rough estimate of the diff size, because 1) SVN uses xdelta 1 instead of xdelta 3, and 2) SVN uses a window size of 100 KB, while command-line xdelta3 does not allow a window size smaller than 512 KB. However, the diff size displayed by the script should be very close to the diff size on the SVN server in most cases.
For programmers: the source code for pk4diffexe is located in devel/pk4diff/src. It requires CMake and Conan to be built. File conan_install.bat contains commands used to build it.
Trash
An optimized pk4 file contains "trash" files. They are located in the __trash__ directory and have filenames of the form trashKKK._tbin. Since they have a weird extension, they should never affect how the TDM game works.
If the total amount of trash is less than 100 KB, then you can safely delete it using the 7-zip program before committing the update. When the total amount of trash is more than 100 KB, deleting it will break the SVN diff algorithm, most likely resulting in a maximum-size diff. We should therefore control the amount of trash in order to balance reducing SVN storage on the server against reducing download traffic and storage on players' machines. That's why the pk4diff script only accepts the optimized package if the amount of trash is lower than 10%.
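To see how much trash a pk4 currently carries, the archive can be inspected with Python's zipfile module. This is a minimal sketch with hypothetical helper names; counting compressed bytes (rather than uncompressed) and reading "100 KB" as 100 * 1024 bytes are both assumptions.

```python
import zipfile

TRASH_PREFIX = "__trash__/"
SAFE_DELETE_LIMIT = 100 * 1024  # Assumption: "100 KB" is read as 100 * 1024 bytes.


def trash_bytes(pk4):
    """Total compressed size of trash entries stored in a pk4 archive."""
    with zipfile.ZipFile(pk4) as zf:
        return sum(i.compress_size for i in zf.infolist()
                   if i.filename.startswith(TRASH_PREFIX)
                   and i.filename.endswith("._tbin"))


def trash_is_safe_to_delete(pk4):
    """True when total trash is under the 100 KB threshold from the text."""
    return trash_bytes(pk4) < SAFE_DELETE_LIMIT
```

Both functions accept either a path to a pk4 file or an open file-like object.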
References
- Thread on developer subforum: https://forums.thedarkmod.com/index.php?/topic/20624-store-missions-archive-in-svn/