Home Data Prep Q&A

Paxata Community Members: Something special in a community experience is coming your way. Stay tuned to this space.
In the meantime, check out the brand new Data Prep for Data Science topic here and the new DataRobot Community.

Visit the official Paxata Documentation portal for all of your doc needs.

Leading zeroes for importing from Excel on Sharepoint

I am trying to import an Excel spreadsheet from a Sharepoint Site as a data set.  There are 4 columns with leading zeroes that are formatted as Text.  They are being imported as numbers which really messes up the data set.  I don’t know if this makes a difference or not but the cell value in the spreadsheet itself is a formula driven field and is not constant.  Is there a work around for this?  

Answers

  • @willm125

    Yes, it's the formula. The library that we use is evaluating the formula result to be a numeric because it contains only 0-9. During import, this results in the leading zeroes being stripped. 

    I tried several formula tricks in Excel to force the evaluation to string. Ultimately, I had to include a non-numeric character. I used n "X" character.  
    • =TEXT(<Cell Name>,"X00000")
    • Within a Project you can easily strip this leading character. 

    It's not perfect, but it should get you running. 
  • The problem is it is not my data and several other reports, including mine, rely on this without the "X".  
  • bstephensbstephens Posts: 44 mod
    edited February 14, 2020 9:48PM
    You could load the Excel as is, allowing the leading zeroes to be stripped.  Then, in the projects that use this value, add the leading Zeroes back using the PADLEFT function. 
    • PADLEFT(@<columnsName>@, <endingStringLength>, "0" )
    Thoughts on that approach?

    Thanks,
    Bill
  • Still not going to work.  Some of these codes need to be 6 characters, some 8 and some 9.  There is nothing in the data set to tell us which is which and the Excel formula does it for us.  That is why we need to have the leading zeroes brought in.  To me, Paxata should allow us to tell it when importing if it is a text field or a numeric field for ease of use and allow us to make that determination to eliminate the extra steps.  We rely on leading zeroes in almost all of our data sets and I can't believe we are the only ones with this issue.
Sign In or Register to comment.