Data Mining
Data mining news
Monday, March 9th, 2009 | Data Mining | No Comments
Wolfram's new software fishes unstructured data out of the Web, structures it, runs its algorithms on it, and then generates "facts" and "answers" to queries entered through a Google-like search form.
VBA MsWord - Document Mining
Friday, February 13th, 2009 | Data Mining, VBA | No Comments
http://word.mvps.org/FAQs/MacrosVBA/index.htm
Ms Word - VBA - MVPS - FAQ
http://www.kayodeok.btinternet.co.uk/favorites/kbofficeword.htm
Using Visual Basic .NET from VBA to Serialize Word Documents as XML
http://msdn.microsoft.com/en-us/library/aa140276(office.10).aspx
Transforming Word Documents into the XSL-FO Format
http://msdn.microsoft.com/en-us/library/aa537167(office.11).aspx
XSL-FO is an intermediate form that results from applying an XSLT style sheet to an XML structured document. The XSL-FO form describes how pages appear when presented to a reader, such as a Web browser. Currently, there are no readers that directly interpret an XSL-FO document. To interpret them, you must run them through a formatter, along with other data, such as graphics and font metrics, to create a final displayable or printable file. Possible formats for the resulting file include Adobe's Portable Document Format (PDF) and Hypertext Markup Language (HTML).
When compared to Cascading Style Sheets (CSS), XSL-FO provides a more sophisticated visual layout model. You can use CSS to apply specific style elements to an XML or HTML document. By contrast, XSL-FO is a language for describing a complete document. It includes everything needed to paginate and format a document. Some of the formatting supported by XSL-FO, but not by CSS, includes right-to-left and top-to-bottom text, footnotes, margin notes, page numbers in cross-references, and more. Note that while CSS is primarily intended for use on the Web, XSL-FO is designed for broader use. As an example, you could use an XSL-FO document to lay out an XML document as a printed book. You could write a completely separate XSLT style sheet to transform the same XML document into HTML.
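To make the description above more concrete, here is a minimal sketch in Python that builds a one-page XSL-FO document: a page master describing the page geometry, and a page sequence whose flow holds a single block of text. The function name `build_fo_document`, the master name "A4" and the page dimensions are illustrative assumptions, not taken from the linked article. The output still has to be passed to an XSL-FO formatter to become a PDF.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the XSL-FO specification.
FO_NS = "http://www.w3.org/1999/XSL/Format"
ET.register_namespace("fo", FO_NS)


def fo(tag):
    """Return the namespace-qualified name of an XSL-FO element."""
    return f"{{{FO_NS}}}{tag}"


def build_fo_document(text):
    """Build a minimal XSL-FO document containing one block of text."""
    root = ET.Element(fo("root"))

    # Page geometry: one simple page master with a body region.
    masters = ET.SubElement(root, fo("layout-master-set"))
    page = ET.SubElement(masters, fo("simple-page-master"), {
        "master-name": "A4",          # hypothetical name, referenced below
        "page-height": "29.7cm",
        "page-width": "21cm",
        "margin": "2cm",
    })
    ET.SubElement(page, fo("region-body"))

    # Content: a page sequence that flows text into the body region.
    sequence = ET.SubElement(root, fo("page-sequence"),
                             {"master-reference": "A4"})
    flow = ET.SubElement(sequence, fo("flow"),
                         {"flow-name": "xsl-region-body"})
    block = ET.SubElement(flow, fo("block"), {"font-size": "12pt"})
    block.text = text

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_fo_document("Hello, XSL-FO"))
```

In practice such a document is usually produced by an XSLT style sheet applied to the source XML rather than assembled by hand; the sketch only shows what the resulting structure looks like.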
XPath Tutorial - W3Schools
http://www.w3schools.com/xpath/default.asp
XPath is a language for finding information in an XML document. XPath is used to navigate through elements and attributes in an XML document.
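As a small illustration of navigating elements and attributes, the sketch below uses Python's standard `xml.etree.ElementTree` module, which supports only a limited subset of XPath. The sample bookstore document and the function names are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE = """<bookstore>
  <book category="web">
    <title lang="en">Learning XML</title>
    <price>39.95</price>
  </book>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <price>30.00</price>
  </book>
</bookstore>"""


def all_titles(root):
    """Select every <title> that is a child of a <book> under the root."""
    return [t.text for t in root.findall("./book/title")]


def titles_in_category(root, category):
    """Select titles of books whose 'category' attribute matches."""
    path = f"./book[@category='{category}']/title"
    return [t.text for t in root.findall(path)]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(all_titles(root))
    print(titles_in_category(root, "web"))
```

Full XPath features such as functions and most axes are not available in ElementTree; a dedicated XPath library would be needed for those.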
XSL-FO Tutorial
http://www.w3schools.com/xslfo/xslfo_intro.asp
What XSL-FO is, and how to use XSL-FO to format your XML documents for output.
Decision trees
Tuesday, January 13th, 2009 | Data Mining | No Comments
Two links:
English: http://www.autonlab.org/tutorials/dtree18.pdf
Polish: http://www.fizyka.umk.pl/~duch/zajecia/05SemMagInf/03DT.pdf
Validating a PESEL number in SQL
Wednesday, November 19th, 2008 | Data Mining | No Comments
I found the algorithm on this page:
http://wipos.p.lodz.pl/zylla/ut/pesel.html
In the first step I create a temporary table that stores the consecutive digits of the PESEL number in separate columns.
In the second step I create a table with the list of numbers and the test result (TEST is 1 when the checksum matches, 0 otherwise).
/* Step 1: split each PESEL into its 11 digits, one numeric column per digit (a1-a11). */
proc sql;
create table pesele_temp as select
id,
pesel,
input(substr(trim(pesel),1,1),commax1.) as a1,
input(substr(trim(pesel),2,1),commax1.) as a2,
input(substr(trim(pesel),3,1),commax1.) as a3,
input(substr(trim(pesel),4,1),commax1.) as a4,
input(substr(trim(pesel),5,1),commax1.) as a5,
input(substr(trim(pesel),6,1),commax1.) as a6,
input(substr(trim(pesel),7,1),commax1.) as a7,
input(substr(trim(pesel),8,1),commax1.) as a8,
input(substr(trim(pesel),9,1),commax1.) as a9,
input(substr(trim(pesel),10,1),commax1.) as a10,
input(substr(trim(pesel),11,1),commax1.) as a11
from tabela
;quit;
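/* Step 2: weighted sum of the first 10 digits; TEST = 1 when the sum modulo 10 equals the check digit a11. */
proc sql;
create table pesele_v as select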
id,
pesel,
ifn (mod( a1 * 9 +
a2 * 7 +
a3 * 3 +
a4 * 1 +
a5 * 9 +
a6 * 7 +
a7 * 3 +
a8 * 1 +
a9 * 9 +
a10* 7
,10)=a11, 1, 0) as TEST
from pesele_temp
;quit;
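For comparison, the same checksum test can be sketched in Python. It uses the weights from the query above (9, 7, 3, 1, 9, 7, 3, 1, 9, 7) and compares the weighted sum modulo 10 with the eleventh digit. The function name `pesel_checksum_ok` is an assumption made for this example. Unlike the SQL version, it also rejects values that are not exactly 11 digits. It checks the control digit only, not whether the encoded birth date is valid.

```python
# Weights applied to the first ten digits, as in the SQL query above.
WEIGHTS = (9, 7, 3, 1, 9, 7, 3, 1, 9, 7)


def pesel_checksum_ok(pesel: str) -> bool:
    """Return True when the 11th digit matches the weighted checksum."""
    pesel = pesel.strip()
    # Require exactly 11 ASCII digits before doing any arithmetic.
    if len(pesel) != 11 or any(ch not in "0123456789" for ch in pesel):
        return False
    digits = [int(ch) for ch in pesel]
    # zip stops after the ten weights, so the check digit is not included.
    total = sum(w * d for w, d in zip(WEIGHTS, digits))
    return total % 10 == digits[10]


if __name__ == "__main__":
    for candidate in ("44051401359", "44051401358", "1234"):
        print(candidate, pesel_checksum_ok(candidate))
```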
Data mining - an interesting link
Saturday, October 25th, 2008 | Data Mining | No Comments
A rich collection of links to materials on data mining, statistical data analysis, etc. (decision trees… )
Data quality - FreqAll
Saturday, October 25th, 2008 | Data Mining, SAS | No Comments
You receive a data set to analyze, more or less organized and documented. Before starting any analysis, it is worth checking the quality of that data set.
The %freqAll macro will make your life much easier:
http://www2.sas.com/proceedings/forum2008/007-2008.pdf
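The macro itself is described in the linked paper and is not reproduced here. As a rough illustration of the general idea of a first data quality check, and assuming only from the macro's name that it tabulates value frequencies for the variables of a data set, a per-column frequency count can be sketched in Python. The function name `frequency_report` and the sample rows are hypothetical.

```python
from collections import Counter


def frequency_report(rows):
    """Count how often each value occurs in each column.

    rows is a list of dicts (one dict per record); missing values are
    expected to be represented as None and are counted like any other value.
    """
    report = {}
    for row in rows:
        for column, value in row.items():
            report.setdefault(column, Counter())[value] += 1
    return report


if __name__ == "__main__":
    # Hypothetical sample data with one missing value.
    sample = [
        {"sex": "F", "city": "Lodz"},
        {"sex": "M", "city": None},
        {"sex": "F", "city": "Lodz"},
    ]
    for column, counts in frequency_report(sample).items():
        print(column, dict(counts))
```

A table like this quickly exposes unexpected categories, misspelled codes, and the share of missing values in each column.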