Tutorial: How should I begin this kind of project?


To understand how other people approach a project like this, I put together the following project description:

  • The project involves several different customer databases where data should be read and uploaded to the project's database server.

  • Close to nothing is generally known about the customer's database when a new customer arrives, apart from some general knowledge about the business type of the customer.

  • It requires a program to read the data from the customers.

  • It requires a web site with diverse cross sections of the customers' data, in an attempt to do something in a general way for the different customers.

  • It requires handling a database of several GB and synchronizing millions of rows with the customers' databases.

  • The visual appearance and functionality of the web site should be dazzling, including charts and report-server-like functionality with email and SMS reports.

  • The different customers will probably also have different requirements so the system should be parameterized in some way.

  • The different users will probably also like to have some personalized pages.

  • Some advertisement pages for the project, documentation and manuals will probably also be needed.

  • The web pages should load faster than 0.1 second and serve hundreds of simultaneous users.

How would you approach such a requirement?

How many people would you take on the project, initially?

Which different specialities / expertises would you expect to need?

How carefully would you plan such a project?

EDIT: OK it might sound unrealistic, but what should the first steps be and what kind of organization would be capable of handling this appropriately?


There are no requirements listed in this question other than:

  • Large unknown data files will be uploaded and processed into a database.
  • This large data will be displayed on a website.
  • The website should be dazzling.
  • The website should be fast even with hundreds of users.

I recommend you hire someone to get better requirements. If you want technology recommendations, you might get them here:

  • C is fast. (but so are many other things)
  • Flash websites are dazzling. :(
  • Apache or IIS, combined with MySQL, PostgreSQL, or MSSQL, can handle a large server load.



How would you approach such a requirement?


How many people would you take on the project, initially?

Day one? One: me. Day two? Probably more.

Which different specialities / expertises would you expect to need?

To start with: a database guy, a back-end guy, a front-end guy, and a psychic to guess the unknown database structures.

How carefully would you plan such a project?

Not very. Actually, I wouldn't accept this project, not without better specs.


Response via a mind map (direct link to image):
http://itprojectguide.org/files/projectx.png


If I understood your problem correctly, this looks like a data warehouse project to me.

From a technical perspective:

You need to set up a "staging area" for each customer database, which holds the relevant data to be loaded into your central system.

Your job is then to load the data from each staging area, transform it to a common format, and store it in your database.

Then, use reporting tools over your database to build nice reports, data mining, etc.

You can use specialized ETL tools (they might be pretty expensive) or use simple SQL combined with some procedural language and scripts for data transformation and loading.
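As a rough illustration of the "simple SQL combined with some procedural language" route, here is a minimal sketch in Python using the standard-library sqlite3 module. The table names (staging_sales, fact_sales), the columns, and the cents-to-currency conversion are all made up for the example; a real system would use its own server database and a model derived from the actual customer data.

```python
import sqlite3

def load_customer(conn, customer_id, rows):
    """Stage one customer's raw rows, then transform them into the common table.

    `rows` is a list of (sold_on, amount_cents) pairs as delivered by the
    customer; this shape is an assumption made for the example.
    """
    cur = conn.cursor()
    # Staging area: raw values stored as text, exactly as the customer sent them.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS staging_sales "
        "(customer_id TEXT, sold_on TEXT, amount_cents TEXT)"
    )
    # Central warehouse table in the common format.
    cur.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(customer_id TEXT, sold_on TEXT, amount REAL)"
    )
    # Replace this customer's previous staging contents with the new delivery.
    cur.execute("DELETE FROM staging_sales WHERE customer_id = ?", (customer_id,))
    cur.executemany(
        "INSERT INTO staging_sales VALUES (?, ?, ?)",
        [(customer_id, sold_on, cents) for sold_on, cents in rows],
    )
    # Transform and load in plain SQL: skip rows without an amount,
    # cast text to a number, and convert cents to currency units.
    cur.execute(
        """
        INSERT INTO fact_sales (customer_id, sold_on, amount)
        SELECT customer_id, sold_on, CAST(amount_cents AS REAL) / 100.0
        FROM staging_sales
        WHERE customer_id = ?
          AND amount_cents IS NOT NULL
          AND amount_cents <> ''
        """,
        (customer_id,),
    )
    loaded = cur.rowcount  # number of rows inserted by the statement above
    conn.commit()
    return loaded
```

The procedural part (Python) only moves data into the staging table; the transformation itself stays in SQL, which keeps it easy to inspect and rerun per customer.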

You can use specialized reporting tools (e.g. Business Objects) to build your reports on top of the data warehouse. One of their features is that they are designed to let end users customize and build their own reports as well.

From a staffing perspective:

You'll need people who have worked on data warehouse projects. I can't say anything about the size of the team, since it depends on the number of data sources and their complexity.


I would suggest the following steps:

  1. Use the first customers to identify a common data model for the warehouse.
  2. Implement the common data warehouse features (common model, metadata, basic framework for ETL transformation, basic support for reports). This sets up the "infrastructure". Hardware setup and sizing are also needed here.
  3. Use iterations for setting up customers. In each iteration, analyze the source data, develop the data filter that selects only the relevant data to be loaded into the data warehouse, design and implement the data transformation, and design and implement new reports if necessary.

You can treat each new customer as a separate iteration, and run iterations for multiple customers in parallel, once the data warehouse model and structure (the infrastructure) are in place.
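To make the per-customer iteration idea a bit more concrete, here is a small sketch in Python. It assumes that each iteration delivers two things for a customer: a filter that selects the relevant rows and a transformation into the common model. The names CustomerPipeline, run_pipeline, and the two example customers with their fields are hypothetical, not taken from any real system.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]

@dataclass
class CustomerPipeline:
    """What one onboarding iteration produces for a single customer."""
    keep: Callable[[Row], bool]      # data filter: select only relevant rows
    transform: Callable[[Row], Row]  # map a source row to the common model

def run_pipeline(pipeline: CustomerPipeline, source_rows: Iterable[Row]) -> List[Row]:
    """Shared infrastructure: identical for every customer."""
    return [pipeline.transform(row) for row in source_rows if pipeline.keep(row)]

# Hypothetical customers; each entry is the output of one iteration.
PIPELINES: Dict[str, CustomerPipeline] = {
    "shop_a": CustomerPipeline(
        keep=lambda r: r["status"] == "paid",
        transform=lambda r: {
            "customer": "shop_a",
            "date": r["order_date"],
            "amount": float(r["total"]),
        },
    ),
    "shop_b": CustomerPipeline(
        keep=lambda r: not r["cancelled"],
        transform=lambda r: {
            "customer": "shop_b",
            "date": r["day"],
            "amount": r["cents"] / 100,
        },
    ),
}
```

Because the shared run_pipeline function never changes, adding a customer means adding one entry to the registry, which is what allows several onboarding iterations to proceed in parallel.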
