Command Syntax
# Parse unformatted XML files, adding line breaks and indentation
xmllint YOUR_FILE.xml --format > OUTPUT_FILE_NAME.xml
Details
Recently I came across a huge (over 4 GB) XML file that I tried to open, and simply nothing worked. Literally nothing.
I tried to open it in Notepad++, VS Code, and even the vi editor, but without any luck.
I either received a warning saying I did not have enough memory or, devastatingly, vi (the editor I had the most hope for) simply crashed!
It turned out the XML file contained no line breaks, so every editor ran out of memory before it could load the whole thing.
So the fix might be to add line breaks to the file, and even indentation to make it prettier.
On the one hand, this makes the XML file bigger. On the other hand, it makes the file easier for editors to open, because it is already broken into lines and they can go line by line instead of loading the whole thing and then parsing it.
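Before reformatting anything, it can help to confirm that a file really is one giant line. The sketch below is only an illustration: it builds a tiny, hypothetical sample file (standing in for the real one) and inspects it with wc and head, neither of which needs to load the whole file into memory.

```shell
# Hypothetical one-line sample standing in for the real large XML file
sample="$(mktemp)"
printf '<root><item id="1">a</item><item id="2">b</item></root>' > "$sample"

# Count newline characters; 0 means the whole document sits on a single line
line_count=$(wc -l < "$sample" | tr -d ' ')
# Total size in bytes, for comparison
byte_count=$(wc -c < "$sample" | tr -d ' ')

echo "lines: $line_count, bytes: $byte_count"

# Peek at the first 40 bytes without reading the rest of the file
head -c 40 "$sample"; echo
```

A line count of 0 (or 1) next to a very large byte count is a strong hint that missing line breaks are the problem.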
Example
Here we parse the largeXML.xml file that gave us a hard time earlier. It is located in C:\temp, and we process it from the Linux subsystem on Windows (this assumes the subsystem is already installed and configured). To open it, open Run (Win + R), type bash, and hit Enter.
Next, navigate to the folder where the file is located. In my case it's the temp directory on the C: drive, which is accessible as /mnt/c/temp in the Linux subsystem.
Missing command!
If you receive an "xmllint: command not found" error, install the libxml2-utils package (on Debian-based systems):
sudo apt-get install libxml2-utils
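To find out up front whether the tool is available, a small check along these lines may be useful. It is a sketch: the variable name xmllint_status is made up for illustration, and the install hint assumes a Debian-based system as above.

```shell
# Check whether xmllint is on the PATH before trying to use it
if command -v xmllint >/dev/null 2>&1; then
    xmllint_status="found"
    # Print the version of xmllint and the libxml2 library it uses
    xmllint --version
else
    xmllint_status="missing"
    echo "xmllint not found; install it with: sudo apt-get install libxml2-utils"
fi

echo "xmllint is $xmllint_status"
```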
Then run the xmllint command in the following format to add line breaks and indentation to the XML file:
# Navigate to C:\temp
cd /mnt/c/temp
# Parse the XML, saving it as largeXML_parsed.xml
xmllint largeXML.xml --format > largeXML_parsed.xml
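To see what the command does without waiting on a multi-gigabyte file, the same idea can be tried on a tiny sample first. This is a minimal sketch, not the exact procedure used above: the sample file and the names workdir, sample.xml and sample_parsed.xml are hypothetical. It also uses two things the xmllint manual page documents, the --output option (an alternative to shell redirection) and the XMLLINT_INDENT environment variable (the indentation string used with --format).

```shell
# Hypothetical one-line sample standing in for largeXML.xml
workdir="$(mktemp -d)"
printf '<root><item id="1">a</item><item id="2">b</item></root>' > "$workdir/sample.xml"

lines_before=$(wc -l < "$workdir/sample.xml" | tr -d ' ')
lines_after=""

if command -v xmllint >/dev/null 2>&1; then
    # XMLLINT_INDENT sets the indentation string (four spaces here);
    # --output writes the result to a file instead of redirecting stdout
    XMLLINT_INDENT="    " xmllint --format --output "$workdir/sample_parsed.xml" "$workdir/sample.xml"
    lines_after=$(wc -l < "$workdir/sample_parsed.xml" | tr -d ' ')
    echo "lines before: $lines_before, lines after: $lines_after"
    cat "$workdir/sample_parsed.xml"
else
    echo "xmllint is not installed; skipping the demo"
fi
```

If the sample comes out with one element per line, the same command form should behave the same way on the large file, only more slowly.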
Verification
After parsing, Vim has no issues opening the file.
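The result can also be checked from the command line without opening an editor at all. The sketch below uses a small, hypothetical stand-in for largeXML_parsed.xml and looks at the line count, the first few lines, and the length of the longest line.

```shell
# Hypothetical stand-in for largeXML_parsed.xml, already formatted
formatted="$(mktemp)"
printf '<?xml version="1.0"?>\n<root>\n  <item id="1">a</item>\n  <item id="2">b</item>\n</root>\n' > "$formatted"

# A formatted file should contain many lines rather than one
formatted_lines=$(wc -l < "$formatted" | tr -d ' ')
echo "lines: $formatted_lines"

# Show only the first three lines; head stops reading after that
head -n 3 "$formatted"

# Length of the longest line in bytes; short lines let editors work line by line
longest=$(LC_ALL=C awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' "$formatted")
echo "longest line: $longest bytes"
```

On the real file, replacing "$formatted" with largeXML_parsed.xml should give a large line count and a modest longest-line length if the formatting worked.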