The interwebs really let me down today. I had one simple thing I wanted to do with a database extract that needed some editing. Man oh man, did it take much more work than anticipated.
Some of the fields in the extract contain HTML formatting that cannot be used in an Excel spreadsheet report. Turns out I was able to find the method for cleaning out the HTML pretty easily. I found a stackoverflow.com conversation that helped with this:
https://stackoverflow.com/questions/9999713/html-text-with-tags-to-formatted-text-in-an-excel-cell
What was hard to find was a method to grab that de-HTMLed text from the clipboard. Turns out I had to add a reference to the “Microsoft Forms 2.0 Object Library” manually (Tools > References in the VBA editor). stackoverflow.com to the rescue again:
Ended up with the following macro, which cleans up the HTML tags in the text of the selected cell and inserts the cleaned-up text into that same cell.
Sub HTMLTextCleanup()
    'Requires a reference to "Microsoft Forms 2.0 Object Library"
    Dim Ie As Object
    Dim DataObj As MSForms.DataObject
    Set Ie = CreateObject("InternetExplorer.Application")
    Set DataObj = New MSForms.DataObject
    With Ie
        .Visible = False
        .Navigate "about:blank"
        'Wait until the blank page has finished loading (4 = READYSTATE_COMPLETE)
        Do While .Busy Or .ReadyState <> 4
            DoEvents
        Loop
        'Let the browser render the HTML text from the selected cell
        .document.body.InnerHTML = ActiveCell.Value
        'Select the contents of the browser render (17 = OLECMDID_SELECTALL)
        .ExecWB 17, 0
        'Copy the contents of the browser render to the clipboard (12 = OLECMDID_COPY)
        .ExecWB 12, 2
        'Delete the HTML text from the active cell
        Selection.ClearContents
        ActiveCell.Select
        'Grab the clipboard text and paste it into the cell
        'This makes sure that all the text is copied into one cell
        'even if paragraph breaks are present
        DataObj.GetFromClipboard
        ActiveCell.FormulaR1C1 = DataObj.GetText(1)
        .Quit
    End With
End Sub
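If adding the Forms library reference by hand is a nuisance, the clipboard read can probably be done late-bound instead. This is only a sketch, not what the macro above uses: ClipboardText is a made-up helper name, and it assumes the MSForms DataObject class is registered on the machine (it normally is wherever Office is installed) so that it can be created by its class ID.

```vba
'Hypothetical helper, not part of the original macro
Function ClipboardText() As String
    Dim DataObj As Object
    'Late-bound MSForms.DataObject created by class ID
    '(assumption: FM20.DLL is registered on this machine)
    Set DataObj = CreateObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
    DataObj.GetFromClipboard
    '1 = plain-text clipboard format; return "" if no text is available
    If DataObj.GetFormat(1) Then ClipboardText = DataObj.GetText(1)
End Function
```

With that in place, the two DataObj lines in the macro could in principle be swapped for ActiveCell.FormulaR1C1 = ClipboardText(), and the Dim/Set lines for DataObj dropped.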
Putting this out in the world in case it can help someone else do the same.