Data Mining

What's New in Data Mining

Monday, March 9th, 2009 | Data Mining | No Comments

Wolfram's new software pulls unstructured data from the Web, structures it, runs its algorithms on it, and generates "facts" and "answers" in response to queries entered through a Google-like search form.

http://www.techcrunch.com/2009/03/08/wolfram-alpha-computes-answers-to-factual-questions-this-is-going-to-be-big/

VBA MsWord - Document Mining

Friday, February 13th, 2009 | Data Mining, VBA | No Comments

http://word.mvps.org/FAQs/MacrosVBA/index.htm

Ms Word - VBA - MVPS - FAQ

 

http://www.kayodeok.btinternet.co.uk/favorites/kbofficeword.htm

Using Visual Basic .NET from VBA to Serialize Word Documents as XML

http://msdn.microsoft.com/en-us/library/aa140276(office.10).aspx

Transforming Word Documents into the XSL-FO Format

http://msdn.microsoft.com/en-us/library/aa537167(office.11).aspx

XSL-FO is an intermediate form that results from applying an XSLT style sheet to an XML structured document. The XSL-FO form describes how pages appear when presented to a reader, such as a Web browser. Currently, there are no readers that directly interpret an XSL-FO document. To interpret them, you must run them through a formatter, along with other data, such as graphics and font metrics, to create a final displayable or printable file. Possible formats for the resulting file include Adobe’s Portable Document Format (PDF) and Hypertext Markup Language (HTML).

When compared to Cascading Style Sheets (CSS), XSL-FO provides a more sophisticated visual layout model. You can use CSS to apply specific style elements to an XML or HTML document. By contrast, XSL-FO is a language for describing a complete document. It includes everything needed to paginate and format a document. Formatting supported by XSL-FO, but not by CSS, includes right-to-left and top-to-bottom text, footnotes, margin notes, page numbers in cross-references, and more. Note that while CSS is primarily intended for use on the Web, XSL-FO is designed for broader use. As an example, you could use an XSL-FO document to lay out an XML document as a printed book. You could write a completely separate XSLT style sheet to transform the same XML document into HTML.
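To make the "complete document" idea concrete, here is a minimal sketch that builds a one-page XSL-FO document with Python's standard library. The function name build_fo_document, the helper fo, and the A4 page settings are assumptions chosen for illustration, not something taken from the linked articles. The output still has to go through a formatter (Apache FOP is one example) before it becomes a PDF.

```python
import xml.etree.ElementTree as ET

# The XSL-FO namespace; registering it makes the output use the "fo:" prefix.
FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name for an XSL-FO element."""
    return "{%s}%s" % (FO_NS, tag)


def build_fo_document(text):
    """Build a minimal XSL-FO document: one page master, one block of text.

    Hypothetical helper for illustration; page size and margin are assumptions.
    """
    root = ET.Element(fo("root"))

    # Page geometry lives in the layout master set.
    layout = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(layout, fo("simple-page-master"), {
        "master-name": "A4",
        "page-width": "210mm",
        "page-height": "297mm",
        "margin": "20mm",
    })
    ET.SubElement(page, fo("region-body"))

    # Content flows into the body region of pages built from that master.
    sequence = ET.SubElement(root, fo("page-sequence"), {"master-reference": "A4"})
    flow = ET.SubElement(sequence, fo("flow"), {"flow-name": "xsl-region-body"})
    block = ET.SubElement(flow, fo("block"), {"font-size": "12pt"})
    block.text = text

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_fo_document("Hello, XSL-FO"))
```

Unlike a CSS rule set, the document carries its own page definition (the page master) alongside the content, which is what lets a formatter paginate it.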

XPath Tutorial - W3Schools

http://www.w3schools.com/xpath/default.asp

XPath is a language for finding information in an XML document. XPath is used to navigate through elements and attributes in an XML document.
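A small sketch of what navigating elements and attributes looks like in practice, using the limited XPath subset supported by Python's xml.etree.ElementTree. The sample bookstore XML and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, used only for this illustration.
SAMPLE_XML = """<bookstore>
  <book category="web"><title>Learning XML</title><price>39.95</price></book>
  <book category="cooking"><title>Everyday Italian</title><price>30.00</price></book>
</bookstore>"""

root = ET.fromstring(SAMPLE_XML)

# Path expression: every <title> that is a child of a <book> under the root.
all_titles = [title.text for title in root.findall("./book/title")]

# Predicate on an attribute: only books whose category attribute is "web".
web_titles = [book.findtext("title")
              for book in root.findall("./book[@category='web']")]

# Reading attribute values from the selected elements.
categories = [book.get("category") for book in root.findall("./book")]

if __name__ == "__main__":
    print(all_titles)
    print(web_titles)
    print(categories)
```

ElementTree implements only part of XPath; full XPath 1.0 expressions need a dedicated library.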

XSL-FO Tutorial

http://www.w3schools.com/xslfo/xslfo_intro.asp

What XSL-FO is, and how to use XSL-FO to format your XML documents for output.


Decision Trees

Tuesday, January 13th, 2009 | Data Mining | No Comments

Two links:

English: http://www.autonlab.org/tutorials/dtree18.pdf
Polish: http://www.fizyka.umk.pl/~duch/zajecia/05SemMagInf/03DT.pdf

;)


Checking PESEL Number Validity in SQL

Wednesday, November 19th, 2008 | Data Mining | No Comments

I found the algorithm on this page:
http://wipos.p.lodz.pl/zylla/ut/pesel.html

In the first step, I create a temporary table that stores the successive digits of the PESEL number in separate columns.

In the second step, I create a table with the list of numbers and the result of the test.

/* Step 1: split the PESEL string into one numeric column per digit (a1..a11). */
proc sql;
create table pesele_temp as select
id,
pesel,
input(substr(trim(pesel),1,1),commax1.) as a1,
input(substr(trim(pesel),2,1),commax1.) as a2,
input(substr(trim(pesel),3,1),commax1.) as a3,
input(substr(trim(pesel),4,1),commax1.) as a4,
input(substr(trim(pesel),5,1),commax1.) as a5,
input(substr(trim(pesel),6,1),commax1.) as a6,
input(substr(trim(pesel),7,1),commax1.) as a7,
input(substr(trim(pesel),8,1),commax1.) as a8,
input(substr(trim(pesel),9,1),commax1.) as a9,
input(substr(trim(pesel),10,1),commax1.) as a10,
input(substr(trim(pesel),11,1),commax1.) as a11
from tabela
;
quit;

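/* Step 2: weighted checksum. TEST = 1 when the PESEL passes, 0 otherwise.  */
/* The length and digit checks guard against empty or non-numeric values:   */
/* their digits parse as missing, and SAS treats missing = missing as true. */
proc sql;
create table pesele_v as select
id,
pesel,
ifn (length(pesel) = 11
and notdigit(trim(pesel)) = 0
and mod( a1 * 9 +
a2 * 7 +
a3 * 3 +
a4 * 1 +
a5 * 9 +
a6 * 7 +
a7 * 3 +
a8 * 1 +
a9 * 9 +
a10* 7
,10)=a11, 1, 0) as TEST
from pesele_temp
;
quit;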
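For comparison, the same checksum can be sketched in a few lines of Python. This uses the weights from the SQL above (9, 7, 3, 1, ...) and compares the weighted sum modulo 10 with the eleventh digit. The function name pesel_checksum_ok is an assumption made for this example. It checks only the control digit, not whether the encoded birth date is plausible.

```python
# Weights for the first ten digits, as used in the SQL version above.
WEIGHTS = (9, 7, 3, 1, 9, 7, 3, 1, 9, 7)


def pesel_checksum_ok(pesel):
    """Return True if the string has 11 digits and a matching control digit.

    Hypothetical helper for illustration; it does not validate the birth date.
    """
    pesel = pesel.strip()
    # Reject anything that is not exactly eleven ASCII digits.
    if len(pesel) != 11 or not (pesel.isascii() and pesel.isdigit()):
        return False
    digits = [int(ch) for ch in pesel]
    weighted_sum = sum(w * d for w, d in zip(WEIGHTS, digits[:10]))
    # The last digit of the weighted sum must equal the eleventh digit.
    return weighted_sum % 10 == digits[10]


if __name__ == "__main__":
    print(pesel_checksum_ok("44051401359"))
    print(pesel_checksum_ok("44051401358"))
```

The explicit length and digit checks play the same role as a guard against missing values in the SQL: a malformed input is rejected instead of slipping through the comparison.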


Data Mining - An Interesting Link

Saturday, October 25th, 2008 | Data Mining | No Comments

A rich collection of links to materials on data mining, statistical data analysis, etc. (decision trees… )

http://www.autonlab.org/tutorials/


Data Quality - FreqAll

Saturday, October 25th, 2008 | Data Mining, SAS | No Comments

You receive a data set to analyze, more or less organized and documented. Before starting any analysis, it is worth checking the quality of that data set.

The %freqAll macro will make your life much easier:

http://www2.sas.com/proceedings/forum2008/007-2008.pdf
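The sketch below is not the %freqAll macro and does not reproduce its logic. It only illustrates the general idea of a first data quality pass: tabulating value frequencies, including missing values, for every column. The function name freq_all, the "<missing>" label, and the sample rows are assumptions made for this example.

```python
from collections import Counter


def freq_all(rows, missing=(None, "")):
    """Count value frequencies for every column in a list of dict rows.

    Hypothetical helper for illustration; values treated as missing are
    grouped under the "<missing>" label.
    """
    frequencies = {}
    for row in rows:
        for column, value in row.items():
            key = "<missing>" if value in missing else value
            frequencies.setdefault(column, Counter())[key] += 1
    return frequencies


if __name__ == "__main__":
    # Made-up sample rows.
    sample = [
        {"sex": "F", "age": 30},
        {"sex": "M", "age": None},
        {"sex": "F", "age": 30},
        {"sex": "", "age": 41},
    ]
    for column, counts in freq_all(sample).items():
        print(column, counts.most_common())
```

Scanning the per-column counts quickly exposes unexpected codes, inconsistent spellings, and the share of missing values.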

