Today I ran into a lot of trouble. I was working on a Word (2016) document, happily typing away and saving the document to SharePoint Online from time to time.
When I was finished, I decided to share the document and test the link provided to me. Instead of the document, I got an error message. Not too hopeful, I decided to open the document in Word instead of Word Online.
And then it happened….
Ok. These things do happen. I guess. And I’m not fainthearted. So, I decided to solve this.
And I have. But it has been a struggle. Let me guide you through the process.
Office Open XML format
The first thing to remember is that Office uses the Office Open XML format. Every .docx document is basically a zip file. The content is stored inside it, including the document metadata, the structure and even some SharePoint metadata.
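To see this for yourself without renaming anything, here is a minimal Python sketch that lists what is inside a .docx file. The function name list_docx_parts is my own invention for illustration; only the standard zipfile module is used.

```python
import zipfile


def list_docx_parts(docx_path):
    """Return the names of all files stored inside a .docx archive."""
    # A .docx that is not even a valid zip cannot be fixed by editing XML.
    if not zipfile.is_zipfile(docx_path):
        raise ValueError(f"{docx_path} is not a readable zip archive")
    with zipfile.ZipFile(docx_path) as archive:
        return archive.namelist()
```

If the list contains word/document.xml, the archive itself is intact and the problem is most likely in the XML.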
So, step 1: rename the file, changing the .docx extension to .zip.
Next, it is time to find the culprit. In this case, it is the document.xml file. This can be found in the word folder.
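The same step can be done with a few lines of Python instead of renaming and unzipping by hand. This is only a sketch: extract_document_xml and the output path are hypothetical names, and it assumes the archive still opens as a zip.

```python
import zipfile
from pathlib import Path


def extract_document_xml(docx_path, out_path):
    """Copy word/document.xml out of the .docx so it can be edited.

    Returns the number of bytes written.
    """
    with zipfile.ZipFile(docx_path) as archive:
        data = archive.read("word/document.xml")
    Path(out_path).write_bytes(data)
    return len(data)
```

Work on a copy of the .docx, so the original stays untouched if something goes wrong.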
Step 2: Open document.xml with Notepad++
To open this XML file and recover the document, I used Notepad++. I even used the XML plugins, but these weren’t useful to me (in the end). So, let’s open the document and remember the error: Line 2.
Ok. So the error’s in line 2. Line 2 has all the XML code stored within. And I can’t use the XML plugins, because of the error. We can get around this, thanks to some information on the web.
Step 3: Replace some code
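Within the XML file, replace all >< with >\r\n< (in the Notepad++ Replace dialog, set the search mode to Extended so that \r\n is treated as a line break).
This starts a new line between adjacent tags, so the XML is no longer crammed into one single line. And this is very handy.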
You will see all the XML now.
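If Notepad++ struggles with the file, the same replacement can be sketched in Python. The function name split_tags is hypothetical, and the sketch assumes document.xml uses an ASCII-compatible encoding such as UTF-8, so the replacement can be done on the raw bytes.

```python
def split_tags(xml_bytes):
    """Insert a Windows line break between every pair of adjacent tags."""
    # Same replacement as in the editor: '><' becomes '>\r\n<'.
    return xml_bytes.replace(b"><", b">\r\n<")
```

For example, split_tags(b"<a><b/></a>") puts each of the three tags on its own line.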
Step 4: Save the document.xml back
A weird step this, but it works. Save the document.xml again. Use the Save a copy function for this. Notepad++ will detect errors in the XML and won’t save the original. But a copy (using the same name) will do the trick.
After saving the file, include it into the .zip document and rename it back to .docx.
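Putting the file back can also be scripted. Python's zipfile module cannot overwrite a single entry in place, so this sketch copies every entry to a new archive and swaps in the edited XML along the way. The function name replace_document_xml and its parameters are my own assumptions, not part of any Office tooling.

```python
import zipfile


def replace_document_xml(src_docx, dst_docx, new_xml):
    """Write a copy of src_docx to dst_docx with word/document.xml replaced.

    new_xml must be bytes. All other entries are copied unchanged.
    """
    with zipfile.ZipFile(src_docx) as src, \
            zipfile.ZipFile(dst_docx, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename == "word/document.xml":
                data = new_xml
            else:
                data = src.read(item.filename)
            # Passing the original ZipInfo keeps the entry name and metadata.
            dst.writestr(item, data)
```

Write the result to a new file name rather than over the source, so every attempt can be compared with the previous one.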
Step 5: Done
Yeah, right. No, it’s not that easy. Now that you’ve saved the document.xml file back into the .docx file, Word will provide us with some more information. Try to open the document again. You will get the same error, but this time it points to the exact line you need.
In this case, line 720. Don’t bother with the column.
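You can also ask an XML parser for the line instead of going through Word each time. The sketch below uses Python's standard xml.etree.ElementTree. The function name first_xml_error is hypothetical, and I am assuming the parser stops at the same kind of problem Word complains about, which may not hold for every file.

```python
import xml.etree.ElementTree as ET


def first_xml_error(xml_bytes):
    """Return (line, column) of the first parse error, or None if well-formed."""
    try:
        ET.fromstring(xml_bytes)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple.
        return err.position
    return None
```

Run it on the document.xml that already has the line breaks from step 3, otherwise the answer will simply be line 2 again.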
Step 6: Modify the document.xml.
Now we know what to look for. Don’t bother trying to get to the reason for the error. All code looks healthy to me. Just go to the line stated.
Here you will either find a line beginning with </ or <.
Now comes the hard part. Find the entire section of the line. In my example, the line is </w:pict>, so I have to look for the connecting <w:pict> part.
This part begins at line 653. I select all of the code and simply delete it.
Now save the file again, place it back into the .zip file. Rename the .zip file to .docx and try to open it. In my case, I got another error message. This one also to do with <w:pict>. I removed that section as well.
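Finding the opening tag that belongs to a closing tag is the fiddly part, so here is a rough Python sketch of it. It walks backwards from the reported line and counts opening and closing tags of the same name until they balance. The function names find_opening_line and remove_block are hypothetical, and the sketch assumes the file was already split into lines as in step 3 and that attribute values contain no > characters.

```python
import re


def find_opening_line(lines, closing_line_no, tag):
    """Return the 1-based line number of the opening tag that matches
    the closing tag on closing_line_no, or None if it is not found."""
    # Matches <tag> or <tag attr="..."> but not self-closing <tag ... />.
    open_re = re.compile(r"<" + re.escape(tag) + r"(?:\s[^<>]*?)?(?<!/)>")
    close_str = "</" + tag + ">"
    depth = 0
    for idx in range(closing_line_no - 1, -1, -1):
        depth += lines[idx].count(close_str)
        depth -= len(open_re.findall(lines[idx]))
        if depth <= 0:
            return idx + 1
    return None


def remove_block(lines, start_line_no, end_line_no):
    """Return a copy of lines without start_line_no..end_line_no (inclusive)."""
    return lines[:start_line_no - 1] + lines[end_line_no:]
```

With the error at line 720 and the tag w:pict, find_opening_line(lines, 720, "w:pict") gives the start of the section, and remove_block deletes everything from that line up to and including line 720.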
Step 7: Success
In the end, when you have removed all the sections, Word will open the document. In my case, no content was harmed using this procedure. And I was very, very happy.
Woah! Amazing and worked great! I ended up having to use sed and vi because notepad++ kept crashing, but the principles were the same and I got my document back!
Great that it worked and thanks for the feedback!
I cannot fix it and it says corrupted. Is there any way you can help me?
I tried and could not fix it using this method. I am not tech-savvy. I wonder whether you can help me.
I can try. Can you send me the document? Just send it to “firstname.lastname@example.org” and I’ll see if I can help.
It took 5 versions but far better than recreating a 30 page document. Thanks for the help
Thanks for the response 👍
Thank you so much. It works well.
It works like a charm, you saved me days of work *.* and provided a lot of “nerd fun” and satisfaction.
I can’t recover my doc… Is there anyone that can help me please? 😥
Sorry – sometimes the file is too corrupted. But if you can share the document, then perhaps I can take a look….
Dear Albert, Good morning.
I can’t open my docx file. I have this message “The name in the end tag of the element must match the element type in the start tag. word/document.xml, line: 2, column: 4328”. Would you please help. Thanks
I just sent you an e-mail 🙂
Could you email me too? I have the same problem.
Sorry, I’m afraid I won’t be able to help.
But if you follow the instructions, these will help.
If not, then I’m afraid even I cannot solve the problem.
Hi, thank you for posting these instructions! They worked wonderfully for one of my documents and saved me weeks of work! However, I could not restore another Word document. The error keeps coming up over and over again, with the next error always one row below, until finally almost all of the document’s lines were ‘involved’. Could you please tell me how I can fix the document in this case? What can I do to stop these errors from coming up endlessly in this Word document?
I would love to help. It’s a bit difficult without the file. But you might try this as well:
Hope this helps!
Thanks for the link! Unfortunately it could not open the document either, but I managed to recover some of my notes from a different file.