Commit Graph

62 Commits

Author SHA1 Message Date
David Korth
0b682acbd6 main.cpp: Missing _T(). 2019-06-12 00:29:46 -04:00
David Korth
ca946f2810 main.cpp: Print the TinyXML2 error strings and handle TinyXML2 error codes correctly.
This will make it easier to diagnose errors when parsing XML files.
2019-06-10 00:41:26 -04:00
David Korth
92314dc503 Mst::saveMST(): Fix a segmentation fault if a string has no name set.
This is usually an error caused by a missing string index. I might end
up removing string indexes later, since I don't think it's actually
needed. Need to test reordering strings by changing indexes but keeping
the same string names later.

This bug was reported by Jotaro Powered.
2019-06-02 00:14:52 -04:00
David Korth
2b94f9ff14 Mst::saveMST(): Moved u16string msg_text outside of the loop.
This eliminates having to create and destroy the u16string every
iteration.

Added msg_text.clear() to the case where byteswapping is needed but the
string is empty to ensure that it gets cleared. We were previously
depending on the string being reinitialized on every iteration for the
string to be cleared.
2019-05-31 01:46:56 -04:00
David Korth
8164823f9b Mst::saveXML(const TCHAR*): TODO: Delete the XML file on error? 2019-05-31 01:43:04 -04:00
David Korth
bb68f00dfe Mst::saveMST(): Allow empty text messages, even though it doesn't make sense.
Better to save an empty string than to fail for seemingly no reason.
2019-05-31 01:40:14 -04:00
David Korth
8a80457806 Mst: Work around a TinyXML2 bug where if a node's text is all spaces, it's interpreted as empty.
This broke msg_dl0030_title.e.mst, which has some strings that are all
spaces.

This bug was reported by Jotaro Powered.
2019-05-31 01:37:01 -04:00
David Korth
b2ab0dcb29 doc/MST_format.md: Updated for recent changes. 2019-05-29 23:25:50 -04:00
David Korth
54f163287d Mst: Removed all of the unnecessary differential offset table code.
The generated differential offset table matches the original now, so we
don't have to save the original table in the XML anymore.
2019-05-29 23:05:36 -04:00
David Korth
9c6c385e63 Mst::saveMST(): Generate the differential offset table, again.
Always write three offsets per string, though only write two entries in
the differential offset table if the placeholder name isn't present.

vOffsetTblType is no longer necessary, since the string types are
inferred by the field names in WTXT_MsgPointer.

Removed the extra zero DWORD at the end of the offset table, since
this was actually the placeholder offset field of the last entry.

msg_mainmenu.e.mst converts to XML and back to MST perfectly,
as do msg_town_mission_shadow.j.mst and msg_town_mission_sonic.e.mst.

common.h: Added for ASSERT_STRUCT() to ensure we have proper struct
size assertions. Will use it for ALIGN() later.
2019-05-29 22:57:10 -04:00
David Korth
19f15a509a Mst::loadXML(): Load the placeholder name attribute. 2019-05-29 21:50:20 -04:00
David Korth
6b364f56df Mst::dump(): Escape the string text.
Mst::escape(const u16string&): Escape newlines and form feeds properly.
2019-05-29 21:42:59 -04:00
David Korth
39d0260bff Mst::dump(): Print placeholder names. 2019-05-29 21:40:09 -04:00
David Korth
e065edf40d Mst: Use the offset table directly when loading; offsets are in triplets.
The three offsets:
- String name offset (Shift-JIS)
- String text offset (UTF-16)
- Placeholder name offset (Shift-JIS, may be zero)

Placeholder name offset is usually 0. This is still present in the
offset table, but the differential offset table *skips* it by using
'B'. The differential offset table is useless to us, but it's required
in order for Sonic'06 to parse it correctly and not crash.

saveXML(): The placeholder string is now saved as an attribute in the
relevant message instead of as its own message.

TODO: Update loadXML() and saveMST().
2019-05-29 21:37:59 -04:00
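
The triplet layout described in e065edf40d can be sketched roughly as follows. This is only an illustration: the struct name, field names, and helper functions are hypothetical and are not taken from mst_structs.h. It assumes big-endian storage, which matches the note elsewhere in this log that Sonic'06 is big-endian only.

```cpp
#include <cstdint>

// Hypothetical layout of one offset table entry (names are illustrative).
struct MsgPointerSketch {
    std::uint32_t name_offset;         // String name (Shift-JIS)
    std::uint32_t text_offset;         // String text (UTF-16)
    std::uint32_t placeholder_offset;  // Placeholder name (Shift-JIS); 0 if absent
};
static_assert(sizeof(MsgPointerSketch) == 12, "entry must be exactly three DWORDs");

// Read a big-endian 32-bit value, independent of host endianness.
inline std::uint32_t read_be32(const std::uint8_t *p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) <<  8) |
            static_cast<std::uint32_t>(p[3]);
}

// Decode one 12-byte triplet from the raw offset table.
inline MsgPointerSketch read_triplet(const std::uint8_t *p)
{
    MsgPointerSketch ptr;
    ptr.name_offset        = read_be32(p);
    ptr.text_offset        = read_be32(p + 4);
    ptr.placeholder_offset = read_be32(p + 8);
    return ptr;
}
```

A zero placeholder_offset would then be treated as "no placeholder name" by the caller.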
David Korth
0e92d2c704 Mst::unescapeDiffOffTbl(): Don't add an extra DWORD of 0s if it doesn't end in 0.
msg_town_mission_shadow.j.mst ends on a DWORD boundary and doesn't have
extra 0s, so this resulted in a non-matching file when converting from
MST to XML and back to MST.
2019-05-29 21:20:11 -04:00
David Korth
bfe8a1e424 Mst::saveXML(): Save the per-message differential offsets in each <message> element.
Also for the initial WTXT/message count values in <mst06>.

For msg_mainmenu.e.xml: It seems that "AA" appears for messages that have
a button placeholder ('$'), and "A" appears for the button image string
immediately after that message. I might be able to use this information
later to properly generate the differential offset table instead of
including it in the XML file.

TODO: Remove <diffOffTbl> and use the per-message values.
2019-05-28 20:27:29 -04:00
David Korth
c44f17caa5 Mst::saveMST(): Deduplicate string names.
If a string name is the same as the string table name, it's deduplicated.

msg_mainmenu.e.mst can now be converted to XML and back to MST, and the
resulting MST is identical to the original.

TODO: Do we need to deduplicate all strings, or just the string table name?
2019-05-27 21:07:31 -04:00
David Korth
6b019f67a9 Mst::saveMST(): Use the existing differential offset table.
The resulting MST looks much closer to the original, though it seems
that the original MST tool used string deduplication, so the new one
has extra copies of strings. I don't know if this will break the game.
2019-05-27 21:00:24 -04:00
David Korth
2cb243ed3e Mst::getNextDiffOff(): Split this out of loadMST().
This will be used by saveMST() when using the original differential
offset table.
2019-05-27 20:32:31 -04:00
David Korth
21e3ca7646 Mst: Use '\0' for 0 bytes in the differential offset table; make sure the table ends in '\0'.
If it doesn't end in '\0', add another four '\0' bytes.
2019-05-27 20:15:59 -04:00
David Korth
a6bcd60216 Mst::loadXML(): Load the differential offset table.
unescapeDiffOffTbl(): Unescape the XML-compatible differential offset
table and load it into a std::vector<uint8_t>.

TODO: Make use of the differential offset table in saveMST().
2019-05-27 20:08:21 -04:00
David Korth
0721a0213c Mst::escapeDiffOffTbl(): s/diffTbl/diffOffTbl/g 2019-05-27 19:55:27 -04:00
David Korth
f7ce4f14e7 Mst::escapeDiffOffTbl(): Handle '\n', '\f' and '\\'.
Similar to the regular escaped() function.

'\\' is especially needed, since otherwise the escapes wouldn't
work correctly.
2019-05-27 19:52:43 -04:00
David Korth
f570aff7a0 Mst: Keep the differential offset table in memory; save it to XML files.
It seems that Sonic'06 doesn't like our custom-built differential
offset table, so we'll need to keep the original. This means that
we can't create our own string table from scratch, but that isn't
exactly a very common operation.

(...unless we figure out exactly what it doesn't like...)

Fixed some confusion between offset table names. The first table should
be the offset table, not the differential table.

TODO:
- loadXML(): Load the differential offset table.
- saveMST(): Use the in-memory differential offset table.
2019-05-27 19:49:25 -04:00
David Korth
d22c7ff6d7 Use SPDX license identifiers instead of the license notice. 2019-05-25 11:33:18 -04:00
David Korth
d07febcd78 Mst::saveMST(): Fixed some size_t conversion warnings on 64-bit MSVC 2017. 2019-05-21 21:47:52 -04:00
David Korth
a191588e60 Mst::saveMST(): Convert string names to Shift-JIS when saving.
TextFuncs.hpp: Added cpN_to_utf8().
- TextFuncs_iconv.cpp: Copied this function from rom-properties.
- TextFuncs_win32.cpp: Removed `static` from this function.

TODO: Show warnings for strings with characters that can't be
converted to Shift-JIS?
2019-05-21 21:39:43 -04:00
David Korth
26d47877b5 Mst: Save and load MST version and endianness to/from the XML file.
If found, the specified version and endianness will be used.
If not found, "1B" will be assumed.
2019-05-20 21:54:50 -04:00
David Korth
bb6bb45a84 Mst: Changed m_version to char and store '1', not 1. 2019-05-20 21:49:05 -04:00
David Korth
c50ec4b87c Mst::saveMST(): m_version should have the full character, not just an int version.
TODO: Change it from `uint8_t` to `char`?
2019-05-20 21:45:00 -04:00
David Korth
20067d3f8d Mst::saveMST(): Use m_version and m_isBigEndian for version and endianness.
This includes handling byteswapping properly for little-endian files,
though this hasn't been fully tested with programs. (Sonic'06 is
big-endian only, so we can't really test it there.)
2019-05-20 21:20:43 -04:00
David Korth
2d0d076d3b Mst: Implemented saveMST().
TODO:
- Convert message names from UTF-8 to Shift-JIS.
- Test everything! (Convert from MST to XML, then MST, then back to XML.)
2019-05-20 19:30:07 -04:00
David Korth
22270c6237 Mst::loadMST(): Renamed mst to mst_header.
For consistency with the upcoming saveMST() function.
2019-05-20 19:01:55 -04:00
David Korth
3213aa21f6 Mst::loadXML(): Unescape message text.
TODO: unescape() overload that takes `const char*`?
2019-05-12 15:16:42 -04:00
David Korth
3ab8e2b410 main.cpp: TODO: Print load errors for loadXML(). 2019-05-12 14:40:52 -04:00
David Korth
c2014f11e1 Mst: Added overloads that take FILE* instead of a filename.
main.cpp: Use these overloads.
2019-05-12 14:40:29 -04:00
David Korth
490bfe1413 Mst.cpp: Close the XML files after loading and saving.
This fixes two resource leaks.
2019-05-12 14:37:20 -04:00
David Korth
65a6fe8281 main.cpp: _tfopen()'s mode parameter must be _T().
This fixes the Windows build.
2019-05-12 14:34:01 -04:00
David Korth
e794658230 Mst::loadXML(): Added this function.
TODO:
- Add more error checking, e.g. for skipped indexes.
- Add saveMST() to convert XML to MST.

main.cpp: Check the file type by looking at magic numbers and call the
appropriate load function.
2019-05-12 14:27:53 -04:00
David Korth
ca6857332c doc/MST_format.md: +linebreak 2019-05-11 16:23:10 -04:00
David Korth
c0ed8c3412 doc/MST_format.md: Clarify that zero might be missing entirely. 2019-05-11 16:22:40 -04:00
David Korth
45d9307641 doc/MST_format.md: Mention alignment requirements.
The differential offset table must be DWORD-aligned, both for its starting
offset and for its total size. Message text and names *don't* have to be
DWORD-aligned, though text usually is, and names might be. (Text is
always WORD-aligned, since message text is encoded as UTF-16.)
2019-05-11 16:16:38 -04:00
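
The size requirement from 45d9307641 could be met with a small padding helper along these lines. This is a sketch under assumptions: align_up() and pad_to_dword() are hypothetical names, and the project's own ALIGN() macro may be defined differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Round `value` up to the next multiple of `alignment` (alignment must be > 0).
constexpr std::size_t align_up(std::size_t value, std::size_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

// Pad a byte table with zero bytes so its total size is a multiple of 4 (one DWORD).
inline void pad_to_dword(std::vector<std::uint8_t> &tbl)
{
    tbl.resize(align_up(tbl.size(), 4), 0);
}
```

A table whose size is already a multiple of four is left unchanged, which is consistent with the later fix in 0e92d2c704 for files that end exactly on a DWORD boundary.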
David Korth
b1979b3ded doc/MST_format.md: Added documentation for the MST format. 2019-05-11 16:11:31 -04:00
David Korth
921931b55c mst_structs.h: Renamed the "offset table" to the "differential offset table".
There are two offset tables:
- Regular offset table, which points to strings.
- Differential offset table, which points to offsets in the first
  offset table.

Rename the header fields that point to the differential offset table
in order to reduce confusion.
2019-05-11 15:57:48 -04:00
David Korth
f233db23c6 Mst: Fix incorrect array bounds checks for string accessors.
- strName(): Off by one.
- strText_utf8(), strText_utf16(): Completely wrong check.
  - "<" -> ">="
2019-05-11 15:46:54 -04:00
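
For reference, the corrected check in f233db23c6 amounts to rejecting any index that is greater than or equal to the element count. A minimal sketch, using a hypothetical free function rather than the actual Mst accessors:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical accessor: returns an empty string for out-of-range indexes.
// Valid indexes are 0 .. size()-1, so the rejection test must be ">=".
// Using ">" would let idx == size() through (the off-by-one case).
inline std::string str_name_at(const std::vector<std::string> &names, std::size_t idx)
{
    if (idx >= names.size()) {
        return std::string();
    }
    return names[idx];
}
```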
David Korth
87395b2cba Mst: Added escape() and unescape() functions.
These functions escape and unescape backslash, newline ("\n"), and
form feed ("\f").

MST messages use newline for newlines (obviously), and form feed to
indicate a pause where the user has to press a button to continue
reading the message.

These should be escaped when editing to make it easier for the user
to manipulate things.

saveXML(): Escape string text.
2019-05-11 15:43:19 -04:00
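
The escaping scheme from 87395b2cba can be sketched as below. The function names and the handling of unknown escape sequences are assumptions made for illustration; the real Mst::escape() and Mst::unescape() may behave differently in edge cases.

```cpp
#include <cstddef>
#include <string>

// Escape backslash, newline, and form feed so the text is easy to edit by hand.
inline std::u16string escape_text(const std::u16string &in)
{
    std::u16string out;
    out.reserve(in.size());
    for (char16_t c : in) {
        switch (c) {
            case u'\\': out += u"\\\\"; break;
            case u'\n': out += u"\\n";  break;
            case u'\f': out += u"\\f";  break;
            default:    out += c;       break;
        }
    }
    return out;
}

// Reverse of escape_text().
inline std::u16string unescape_text(const std::u16string &in)
{
    std::u16string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); i++) {
        if (in[i] != u'\\' || i + 1 == in.size()) {
            // Ordinary character, or a trailing lone backslash: copy as-is.
            out += in[i];
            continue;
        }
        i++;  // Look at the character after the backslash.
        switch (in[i]) {
            case u'\\': out += u'\\'; break;
            case u'n':  out += u'\n'; break;
            case u'f':  out += u'\f'; break;
            default:
                // Unknown escape (assumption): keep both characters unchanged.
                out += u'\\';
                out += in[i];
                break;
        }
    }
    return out;
}
```

Escaping the backslash itself is what makes the round trip unambiguous: without it, a literal backslash followed by 'n' in the message text could not be told apart from an escaped newline.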
David Korth
65ce386b0b main.cpp: Accept an output filename parameter.
Just in case the user doesn't want to use the default of
"input filename with .xml extension instead of .mst".
2019-05-11 15:30:36 -04:00
David Korth
aa4019b94d Mst: Added saveXML() to save the string table in XML format.
Added an internal copy of TinyXML2 for Windows builds.
Based on the rom-properties build.

main.cpp: Replace the file extension on the source file with .xml and
use that as the destination filename.
2019-05-11 15:27:42 -04:00
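
The default output filename logic from aa4019b94d might look something like this. It is a simplified sketch: replace_ext() is a hypothetical helper that works on std::string, whereas the actual main.cpp uses TCHAR-based strings.

```cpp
#include <cstddef>
#include <string>

// Replace the extension of `filename` with `new_ext` (which includes the dot).
// If the last path component has no extension, `new_ext` is appended instead.
inline std::string replace_ext(const std::string &filename, const std::string &new_ext)
{
    const std::size_t slash = filename.find_last_of("/\\");
    const std::size_t dot = filename.rfind('.');
    const bool dot_in_dir = (slash != std::string::npos &&
                             dot != std::string::npos && dot < slash);
    if (dot == std::string::npos || dot_in_dir) {
        // No extension in the filename part (the dot, if any, is in a directory name).
        return filename + new_ext;
    }
    return filename.substr(0, dot) + new_ext;
}
```

Only the final extension is replaced, so a language suffix such as ".e" in "msg_mainmenu.e.mst" is preserved.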
David Korth
4f7df957eb Mst::loadMST(): Convert the string table name from Shift-JIS to UTF-8. 2019-05-11 15:11:35 -04:00
David Korth
8cf572184a Mst: Added string name and text accessor functions.
String name is always returned as UTF-8.

String text can be returned as either UTF-8 (converted on the fly)
or UTF-16 (copied directly).
2019-05-11 13:31:28 -04:00