
Advanced Business Analytics Solution – Commercial Taxes Department, Uttar Pradesh


A crucial arm of the Government of Uttar Pradesh, the Commercial Taxes Department is charged with maximizing revenue collection in each term, which helps the government implement social welfare schemes and programs. The department levies and collects VAT, CST, Entry Tax and several other taxes. To expand the tax base, it registers and monitors dealers. It generates receipts for, and scrutinizes, the returns filed with it. Lastly, it sanctions refunds, conducts audits and adopts measures to curb tax evasion among dealers.


The department faced several workflow challenges, the biggest of which was selecting individuals and businesses for audit. In addition, tax collection data accumulated over years of work needed to be analysed so that revenue forecasting could be implemented. Along the way, the department also needed to identify probable tax evaders and defaulters, and to risk-profile the remaining dealers to detect patterns in their behaviour and business.


To analyse the department's data, online records dating back three years were collected, bringing the total amount of information to 4 terabytes. For analysing dealer profiles, registration data and e-return data tables (a total of 10 crore, or 100 million, records) were also collected. With data on this scale, analysis was a challenge, because the records themselves were redundant, repetitive, fragmented and ambiguous. The team solved the problem by preparing the data before going any further.
To do so, data was extracted from the original 4 TB of records and consolidated into a separate source, which reduced the size to about 50 GB. The team then worked on this consolidated source using SPSS and Cognos. This was especially helpful for filtering the data, particularly for dealers, only 30% of whom were on online platforms. It also made it easier to filter the 10 crore records that made up the original data.
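
The consolidation step can be illustrated with a minimal Python sketch. The team's actual work was done with SPSS and Cognos, so this is only an illustration of the idea; the field names (`dealer_id`, `period`, `sales`) and the rule of keeping the last filing per dealer and period are assumptions, not details from the project.

```python
def consolidate(records):
    """Keep one row per (dealer_id, period) after normalising the dealer ID.

    Assumption: when the same dealer and period appear more than once,
    the row that comes last in the input is the one to keep.
    """
    seen = {}
    for rec in records:
        dealer_id = rec["dealer_id"].strip().upper()  # remove spacing/case noise
        key = (dealer_id, rec["period"])
        seen[key] = {**rec, "dealer_id": dealer_id}
    return list(seen.values())


# Hypothetical raw rows: the first two describe the same filing.
raw = [
    {"dealer_id": "up001 ", "period": "2014-04", "sales": 1000},
    {"dealer_id": "UP001", "period": "2014-04", "sales": 1000},
    {"dealer_id": "UP002", "period": "2014-04", "sales": 500},
]
clean = consolidate(raw)
print(len(raw), "->", len(clean))  # 3 -> 2
```

Removing repeats before analysis is what lets every later step work on a much smaller, unambiguous data set.
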
The team conducted a zone-wise tax reconciliation, which checked whether the sales shown by sellers in their returns matched the purchases shown by their customers for the same period. The team also carried out a risk assessment of all dealers in an area based on the available data, and gave each dealer a safety rating based on the examination factors.
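
A reconciliation of this kind can be sketched as a comparison of two sets of declared amounts. The example below is a hedged illustration, not the team's SPSS/Cognos procedure: the `(seller, buyer, period)` key, the dealer IDs and the amounts are all hypothetical.

```python
def reconcile(sales, purchases, tolerance=0.0):
    """Compare seller-declared sales with buyer-declared purchases.

    Both arguments map (seller, buyer, period) to a declared amount.
    Returns a list of (seller, buyer, period, sold, bought) mismatches.
    """
    mismatches = []
    for key in sorted(set(sales) | set(purchases)):
        sold = sales.get(key, 0.0)        # nothing declared counts as zero
        bought = purchases.get(key, 0.0)
        if abs(sold - bought) > tolerance:
            mismatches.append((*key, sold, bought))
    return mismatches


# Hypothetical declarations for one period.
sales = {
    ("S1", "B1", "2014-04"): 1000.0,
    ("S2", "B1", "2014-04"): 400.0,
}
purchases = {
    ("S1", "B1", "2014-04"): 1000.0,
    ("S2", "B1", "2014-04"): 650.0,   # buyer claims more than seller declared
    ("S3", "B2", "2014-04"): 200.0,   # seller declared nothing at all
}
mismatches = reconcile(sales, purchases)
for row in mismatches:
    print(row)
```

Running the comparison zone by zone would only require adding the zone to the key or filtering the inputs first.
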
Data reports also showed the total sales in an area, which could be used for forecasting and for identifying the top sales zones. The team likewise analysed the data to predict the top purchase zones in an area.
For sellers that raised red flags in the risk assessment, separate analyses were conducted on certain sales ratios. A seller who ranked at or above 50% on a particular sales ratio was considered risky. The same process was also applied to purchase ratios to flag questionable purchases.
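
The threshold rule can be illustrated as follows. The source does not say which ratios were used, so the ratio here (a hypothetical share of one sales category in total sales) and the dealer figures are assumptions; the sketch also assumes the 50% cut-off applies to the ratio value itself.

```python
RISK_THRESHOLD = 0.50  # "at or above 50%" as stated in the text


def sales_ratio(part, total):
    """Share of `part` in `total`; zero when there are no sales at all."""
    return part / total if total else 0.0


def risky_dealers(ratios, threshold=RISK_THRESHOLD):
    """Return the IDs of dealers whose ratio is at or above the threshold."""
    return sorted(dealer for dealer, ratio in ratios.items() if ratio >= threshold)


# Hypothetical dealers: (amount in the category of interest, total sales).
figures = {"D1": (600, 1000), "D2": (100, 1000), "D3": (500, 1000)}
ratios = {dealer: sales_ratio(part, total) for dealer, (part, total) in figures.items()}
flagged = risky_dealers(ratios)
print(flagged)  # ['D1', 'D3']
```

The same function could be applied to purchase ratios by passing in a different set of figures.
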
The team also analysed the data to determine the department from which the highest challan amounts were collected. The prepared data was also used to build consolidated dealer clusters, which classified dealers into categories according to their total sales, purchases and ITC (input tax credit) behaviour.
One of the biggest benefits of the report was that it could be used to forecast future sales, purchases and challan collections. The team also compiled these into a comprehensive graph showing previous, current and forecast data (up to one month ahead).
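
The source does not name the forecasting model that was used. As a stand-in, the sketch below shows one of the simplest possible one-month-ahead forecasts, a moving average over recent months; the window size and the monthly figures are hypothetical.

```python
def forecast_next_month(monthly_totals, window=3):
    """Forecast the next month as the mean of the last `window` months.

    This is a plain moving average, used here only as an illustration;
    it is not the model the team built.
    """
    if not monthly_totals:
        raise ValueError("at least one month of history is required")
    recent = monthly_totals[-window:]
    return sum(recent) / len(recent)


# Hypothetical monthly challan collections.
history = [100, 110, 120, 130]
print(forecast_next_month(history))  # 120.0
```

A moving average lags behind a rising or falling trend, so a real forecast for tax collections would likely need a model that accounts for trend and seasonality.
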
Another advantage of the report was that it made circular trading patterns easy to trace from the analysed data. The department could then readily determine which dealers needed to be investigated, based on their commodity details, assessment year, sales details and purchase details.
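
Circular trading can be thought of as a loop in a graph where each sale is an arrow from seller to buyer. The sketch below finds one such loop with a depth-first search. It is an assumed formulation for illustration, not the method documented for the project, and the dealer IDs are hypothetical.

```python
from collections import defaultdict


def find_trading_cycle(transactions):
    """Return one seller -> buyer loop as a list of dealers, or None.

    `transactions` is an iterable of (seller, buyer) pairs.
    """
    graph = defaultdict(list)
    for seller, buyer in transactions:
        graph[seller].append(buyer)

    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = defaultdict(int)   # every dealer starts as UNSEEN
    path = []                  # dealers on the current search path

    def visit(node):
        state[node] = ON_PATH
        path.append(node)
        for nxt in graph.get(node, []):
            if state[nxt] == ON_PATH:
                # Reached a dealer already on the path: a loop is closed.
                return path[path.index(nxt):] + [nxt]
            if state[nxt] == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for start in list(graph):
        if state[start] == UNSEEN:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


# Hypothetical sales: A sells to B, B to C, and C back to A.
transactions = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(find_trading_cycle(transactions))  # ['A', 'B', 'C', 'A']
```

The search is recursive, so very long chains of dealers would need an iterative version; in practice the graph would also be restricted by commodity and assessment year before searching.
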


  • One of the most useful aspects of the work was filtering the data into a prepared database, which helped the team remove repetitive, redundant and erroneous information and, in turn, reduced the time needed to complete the project.
  • The team’s strategic, step-by-step approach to analysing the data ensured that all aspects were covered and that the calculations yielded correct information.
  • Conducting a risk assessment of all dealers allowed the team to identify for the department which dealers were suspect, helping it avoid potentially harmful dealings in the future.
  • Additionally, analysing trading patterns and top sales and purchases, and forecasting future figures, allowed the team to build a comprehensive picture of the business scenario for the department, which frequently deals in large numbers and therefore needs a good projection for future dealings.