Microsoft aims to take the work out of data wrangling with coming ‘Pendleton’ tool

With its growing emphasis on all things AI, coupled with its history as a tool vendor, it’s not surprising that Microsoft is working on tools not just for traditional programmers, but also for data scientists. According to a Microsoft Research presentation from earlier this year, data scientists currently spend 80 percent of their time extracting and cleaning data, a task also known as “data wrangling.” Microsoft wants to fix this.

Pendleton is a client app that works on Windows and OS X/macOS. Its design runtime uses Python and depends on various Python libraries.

The tool can remove errant columns, change the formatting in columns, handle missing data and the like. It also includes analytics tools to help data scientists figure out what’s included in a dataset. Pendleton can read data from SQL Server, Azure Blobs and Data Lakes. It can also read files stored locally on a PC, my contact said.
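For readers unfamiliar with what this kind of cleanup involves, here is a minimal sketch in plain Python of the three chores mentioned above: dropping an errant column, normalizing a column's formatting and filling in missing values. It is purely illustrative and does not use or represent Pendleton itself. The sample data, the column names and the wrangle function are all hypothetical.

```python
import csv
import io

# Hypothetical sample data: an errant "notes" column, inconsistent date
# formatting, and one missing price.
RAW = """id,date,price,notes
1,2017/09/01,10.5,n/a
2,2017-09-02,,n/a
3,2017/09/03,12.0,n/a
"""


def wrangle(raw_csv, drop=("notes",), default_price=0.0):
    """Drop unwanted columns, normalize dates, and fill missing prices."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        # Remove errant columns.
        for col in drop:
            row.pop(col, None)
        # Change formatting: use "-" as the only date separator.
        row["date"] = row["date"].replace("/", "-")
        # Handle missing data: an empty price becomes the default value.
        row["price"] = float(row["price"]) if row["price"] else default_price
        rows.append(row)
    return rows


cleaned = wrangle(RAW)
for row in cleaned:
    print(row)
```

Even on three rows, each fix is a separate hand-written step, which is the sort of repetitive work the 80 percent figure refers to.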

Meanwhile, speaking of data science and big datasets, Microsoft and Facebook announced today a new standard they developed together for representing deep-learning models that allows these models to be transferred between frameworks.

