A Selection of Successful Software Engineering Posts - Part 1




This selection (Part 1) contains 25 blog posts chosen by CodeBalance that were posted on DZone in 2010.

Project Management & IT Business
You're a Bad Manager. Embrace It.

Software Architecture

Modelling/Analysing

About Programmers/Software Engineers:

Programming/Software Engineering Practices

Java

.NET

Mobile Programming

Development Tools

and Some Fun...



10 Differences Between WCF and ASP.NET Web Services




Here are the 10 important differences between WCF Services and ASP.NET Web Services:

[Image: 10 Differences Between WCF and ASP.NET Web Services]

For details: http://msdn.microsoft.com/en-us/library/aa738737.aspx


A Theoretical Introduction to Data Mining




This article introduces the aim of data mining and explains basic concepts and terms.

Data Mining (i.e. knowledge discovery from data): Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) patterns or knowledge from huge amounts of data.

Data Warehouse: A single, complete and consistent store of data obtained from a variety of different sources and made available to end users in a way they can understand and use in a business context. [Barry Devlin] Data warehouses are used as a source of data for data mining.

Potential Uses: Web information mining, spam filtering, medical data mining, weather data mining, market sales strategies, etc.

Data Mining Related Operations
Preprocessing:
Handling Noisy Data: Handling missing, duplicate or erroneous data before data mining. Noisy data can be removed, or corrected by a specific approach (e.g. correlation analysis).
Integration: Combining data from multiple sources.
Normalization: Scaling data to a specified range. For example, scaling the value 750 from the range [500, 1000] to the range [0, 1] gives (750 - 500) / (1000 - 500) = 0.5.
Feature Selection: Selecting only the useful features of the data (i.e. the attributes, for record data).
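The normalization example above can be reproduced with a few lines of Python. This is only a minimal sketch of linear (min-max) scaling; the function name min_max_scale and its parameters are made up for this illustration and do not come from any particular library.

```python
def min_max_scale(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map value from [old_min, old_max] to [new_min, new_max]."""
    if old_max == old_min:
        # A zero-width source range cannot be scaled.
        raise ValueError("old_min and old_max must differ")
    # Position of the value inside the source range, as a fraction.
    ratio = (value - old_min) / (old_max - old_min)
    # Same relative position inside the target range.
    return new_min + ratio * (new_max - new_min)


# The example from the text: 750 in [500, 1000] scaled to [0, 1].
print(min_max_scale(750, 500, 1000))  # 0.5
```

In practice the minimum and maximum are usually taken from the data set itself, one attribute at a time.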

Data Mining:
Classification: Finding a model that predicts the value of a class attribute from the values of the other attributes. (An example class attribute: CustomerBuysProduct (bool))
Different methods can be used for classification:
  • Decision Trees: Builds a decision tree as the model and evaluates new data against the tree.
  • Rule-Based Classifying: Deduces rules from the data (if X = Y and Z < T, then the result is W, etc.).
  • Bayes Classifying: Uses probabilities estimated from previous data to classify new data.
  • K-Nearest Neighbor Classifying: Uses the distances between new data and previous data to classify.
  • ...
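As an illustration of one of the methods listed above, here is a small K-Nearest Neighbor sketch in plain Python. It assumes numeric features, Euclidean distance and a simple majority vote, which is only one of several possible setups. The function name knn_classify and the customer records (age, purchases per year) are invented for this example.

```python
import math
from collections import Counter


def knn_classify(training, query, k=3):
    """Predict a label for query from the k closest training records.

    training is a list of (features, label) pairs; query is a feature tuple.
    """
    # Sort previous data by Euclidean distance to the new record.
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    # Majority vote among the k nearest neighbours.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical records: (age, purchases per year) -> CustomerBuysProduct
training = [
    ((25, 2), False),
    ((30, 3), False),
    ((35, 10), True),
    ((40, 12), True),
    ((45, 11), True),
]

print(knn_classify(training, (38, 9)))  # True
```

Because the method relies on distances, attributes with large ranges dominate the result unless the data is normalized first.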
Clustering: Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups.
Different methods can be used for clustering:
  • K-means Clustering: Splits data into a number of clusters (k) that is chosen in advance.
  • Hierarchical Clustering: Produces a set of nested clusters organized as a hierarchical tree.
  • ...
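The K-means idea can be sketched as a loop that assigns each point to its nearest centroid and then moves each centroid to the mean of its points. The version below is a simplified illustration, not a production implementation: it assumes numeric tuples, Euclidean distance, a fixed number of iterations, and it naively uses the first k points as starting centroids. The function name k_means and the sample points are made up for this example.

```python
import math


def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, clusters)."""
    # Simplistic initialisation (an assumption): the first k points.
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, k=2)
print(clusters)
```

Real implementations usually pick the starting centroids more carefully and stop as soon as the assignments no longer change.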
Association (Rule) Discovery: Producing dependency rules that predict the occurrence of a feature (i.e. attribute) of the data based on the occurrences of other features.
Pattern Discovery: Deducing patterns from the results of classification, clustering, association discovery, etc.
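Dependency rules such as those produced by Association (Rule) Discovery are commonly scored with two measures: support (how often the items occur together) and confidence (how often the rule holds when its left-hand side occurs). The sketch below computes both for a tiny set of shopping baskets. The function names and the basket data are invented for this illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        # The left-hand side never occurs, so the rule has no evidence.
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(transactions, {"bread", "milk"}))       # 0.5
print(confidence(transactions, {"bread"}, {"milk"}))  # 2 of the 3 bread baskets
```

A rule discovery algorithm would keep only the rules whose support and confidence exceed chosen thresholds.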

Postprocessing: Evaluating and selecting interesting patterns, interpreting and visualizing them as an information report.
