SERGE AURUBIN

DATA SCIENTIST

Welcome to my blog for everything related to data science and analysis. I am always looking for new opportunities to drive business insight through data analysis.

SKILLS:

  • DATA ANALYSIS

  • EXCEL

  • PYTHON

  • POWER BI

  • FRONT END DEV

  • FLUTTER

  • BACK END DEV

  • REACTJS

  • PHP

  • MYSQL DEV/DBA

  • GIT

New Kid On The Block - Playwright

If you are a fan of web automation, then you have probably heard about the new kid on the block: Playwright. Other tools such as Selenium and requests have been around for years, but Playwright's clean interface and developer documentation make it one of the go-to tools for web automation.

One example: I need access to data, but I do not have access to the database. The only thing I can reach is an admin interface where I can see all orders. Playwright does much of what Selenium does: it opens a web browser and performs the clicks necessary to reach the screens you need. You log into the admin screen, view all orders, and then step through each order to gather the information you want. The reason to do it this way is that once you have the information, you can save it to a text file or a database, or enter it into another web page. The main benefit of driving this from Python is that once you have the data, you can interrogate it.
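As a rough sketch of that workflow, here is how a Playwright script might log in and walk the order list. The URL, credentials, row format, and CSS selectors below are hypothetical placeholders, not a real admin site; the parsing helper is a plain function so it can be reused (and tested) without launching a browser.

```python
import csv

def parse_order_row(raw: str) -> dict:
    """Turn a raw 'order_id | customer | total' row into a dict (hypothetical format)."""
    order_id, customer, total = [part.strip() for part in raw.split("|")]
    return {"order_id": order_id, "customer": customer, "total": float(total)}

def scrape_orders(admin_url: str, user: str, password: str, out_path: str) -> None:
    # Imported here so the parser above works even without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(admin_url)
        # Selectors are placeholders -- inspect the real admin page to find yours.
        page.fill("#username", user)
        page.fill("#password", password)
        page.click("button[type=submit]")
        page.wait_for_selector(".order-row")

        rows = [parse_order_row(el.inner_text())
                for el in page.query_selector_all(".order-row")]
        browser.close()

    # Once the data is out of the browser, save it anywhere you like.
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["order_id", "customer", "total"])
        writer.writeheader()
        writer.writerows(rows)
```

From here the CSV can go into a database, a report, or another web form, which is exactly where the automation pays off.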

Don't Underestimate The Power of the Python Split Function

One of the ultimate data workhorses is, hands down, the Python split function. It is an absolute powerhouse for parsing raw text into data units that you can feed into other data pipelines, and it remains one of the most effective tools in my data arsenal.

At its core, the split() function does exactly what its name implies: it splits a string into a list of substrings based on a specified delimiter. This seemingly straightforward functionality, however, conceals a wealth of benefits and applications across various domains.

One of the primary advantages of the split() function lies in its ability to parse and extract relevant information from textual data swiftly. By specifying a delimiter, such as a space, comma, or any custom character sequence, Python can effortlessly dissect complex strings into manageable components. This capability proves invaluable in data processing tasks, such as parsing log files, extracting fields from CSV (Comma-Separated Values) files, or tokenizing natural language text.

Furthermore, the split() function enables developers to manipulate and transform data structures seamlessly. By breaking down a string into its constituent parts, programmers can iterate through the resulting list and perform operations on each element individually. This facilitates tasks such as data cleaning, where unwanted characters or whitespace can be easily removed, or data normalization, where disparate formats are standardized for consistency.
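To make that concrete, here is a small sketch of split() in both roles: pulling fields out of a delimited record, and cleaning and normalizing values as you iterate. The record formats are made up for illustration.

```python
# Extract fields from a CSV-style record by splitting on a comma,
# stripping the stray whitespace as we go.
record = "2024-05-01, WIDGET-42 ,  19.99 , new york"
fields = [field.strip() for field in record.split(",")]

date, sku, price, city = fields
total = float(price)  # '19.99' becomes a number you can compute with

# split() with no argument splits on any run of whitespace -- handy for
# tokenizing log lines or natural language text.
tokens = "GET  /index.html   200".split()

# Normalization: standardize disparate city formats for consistency.
city_normalized = " ".join(part.capitalize() for part in city.split())
```

One line per cleaning step, no regular expressions required, which is why it shows up in almost every pipeline I build.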

Keep This Tool Close In Your Python Toolbox

The main problem developers face when gathering data from the web is the browser itself. There are Python tools that drive a real browser to keep websites from identifying you as a bot (though who knows how many of them really care). The requests module, by contrast, lets you pull content straight from the web and integrate it into any other solution.

Here is a good example. You need to collect information from a real estate website that spreads its listings across multiple pages. From each listing you want the housing price, the assigned real estate agent, the homeowner, any contact information, the home's square footage, and an email address. Based on the information collected and the estimated value of the home, you apply internal calculations and then upload those values into your CRM system.

Your CRM system is then responsible for sending communications to those contacts in a prescribed order, while you continually test the order flow and email content for effectiveness. If you assigned this work to a person, it would take them at least a few days to gather the data, perform the required calculations, and enter everything into the CRM.
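A stripped-down sketch of that pipeline might look like the following. The listing URLs, the line-based page format, the valuation formula, and the CRM endpoint are all hypothetical stand-ins; the parsing and scoring steps are plain functions so they can be tested without touching the network.

```python
def parse_listing(raw: str) -> dict:
    """Parse a 'price;sqft;agent_email' listing line (hypothetical format)."""
    price, sqft, email = raw.split(";")
    return {"price": float(price), "sqft": float(sqft), "email": email.strip()}

def estimated_value(listing: dict, price_per_sqft: float = 150.0) -> float:
    """A made-up internal calculation: average the asking price
    with a square-footage-based estimate."""
    return round((listing["price"] + listing["sqft"] * price_per_sqft) / 2, 2)

def run_pipeline(pages: list[str], crm_url: str) -> None:
    # Imported here so the pure helpers above run without the dependency.
    import requests

    session = requests.Session()
    for url in pages:
        resp = session.get(url, timeout=10)
        resp.raise_for_status()
        for line in resp.text.splitlines():
            listing = parse_listing(line)
            listing["estimated_value"] = estimated_value(listing)
            # Upload each enriched listing to the (hypothetical) CRM endpoint.
            session.post(crm_url, json=listing, timeout=10).raise_for_status()
```

Swap in the real page parser and CRM API and the whole round trip runs unattended.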

I have performed similar data pulls where the entire round trip takes less than 15 minutes, including gathering the data, processing it, and entering it into the CRM system.

The advantage once again is on the side of automation.

BOOK REVIEW

BOOK SUMMARY

Transforming data into revenue-generating strategies and actions: organizations are swamped with data collected from web traffic, point-of-sale systems, enterprise resource planning systems, and more, but what should they do with it? Monetizing Your Data provides a framework and path for business managers to convert ever-increasing volumes of data into revenue-generating actions through three disciplines: decision architecture, data science, and guided analytics.

There are large gaps between understanding a business problem, knowing which data is relevant to it, and leveraging that data to drive significant financial performance. Using a proven methodology developed in the field through delivering meaningful solutions to Fortune 500 companies, this book gives you the analytical tools, methods, and techniques to transform the data you already have into information and insights that drive winning decisions.

Beginning with an explanation of the analytical cycle, this book guides you through the process of developing value generating strategies that can translate into big returns. The companion website, www.monetizingyourdata.com, provides templates, checklists, and examples to help you apply the methodology in your environment, and the expert author team provides authoritative guidance every step of the way.