"Very basic clustering"
Hi,
I'm extremely new to both data mining and RapidMiner itself, and am just getting comfortable with simple load and aggregation operations.
I've searched for topics on this, but everything I've found is still a little too complicated for what I'm trying to do.
I'm interested in doing some basic clustering or classification on a single attribute of a dataset.
I've loaded some data from SQL that gives me a count of transactions per day.
I have 4 attributes in my dataset:
Year  Month  Day  Count
2010  10     5    345643
2010  10     4    2000
2010  10     7    2356
2010  10     5    18
2010  09     2    10010
2010  10     18   12
2010  01     5    34252
This is a sample; I have a year's worth of data, so 365 items.
I'm trying to cluster the days into maybe 5 bins based on the size of the count, but I can't seem to target a single attribute using K-means or other algorithms.
Is what I'm trying to do too simplistic for RapidMiner operations? I need to use RM for a project I'm doing.
Thanks,
kgbolger
Answers
To exclude everything but the "count" attribute, you can, for example:
a.) use "Set Role" on the other attributes to make them special attributes, which excludes them from the cluster analysis, or
b.) use "Select Attributes" to keep just the single attribute that interests you before clustering.
greets,
rené
Simple example:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.0.11" expanded="true" name="Process">
    <process expanded="true" height="404" width="748">
      <!-- Generated demo data; replace this with your own data source. -->
      <operator activated="true" class="generate_sales_data" compatibility="5.0.11" expanded="true" height="60" name="Generate Sales Data" width="90" x="41" y="45"/>
      <!-- Keep only the listed attributes before clustering. -->
      <operator activated="true" class="select_attributes" compatibility="5.0.11" expanded="true" height="76" name="Select Attributes" width="90" x="179" y="75">
        <parameter key="attribute_filter_type" value="subset"/>
        <parameter key="attributes" value="single_price|transaction_id"/>
      </operator>
      <!-- K-means with k = 5 clusters. -->
      <operator activated="true" class="k_means" compatibility="5.0.11" expanded="true" height="76" name="Clustering" width="90" x="380" y="75">
        <parameter key="add_as_label" value="true"/>
        <parameter key="k" value="5"/>
      </operator>
      <connect from_op="Generate Sales Data" from_port="output" to_op="Select Attributes" to_port="example set input"/>
      <connect from_op="Select Attributes" from_port="example set output" to_op="Clustering" to_port="example set"/>
      <connect from_op="Clustering" from_port="cluster model" to_port="result 1"/>
      <connect from_op="Clustering" from_port="clustered set" to_port="result 2"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
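
To see what clustering on a single attribute amounts to, here is a rough sketch in plain Python rather than RapidMiner. It runs k-means on the seven sample counts from the question with k = 5. The function name `kmeans_1d` and the evenly spaced starting centroids are assumptions made for this illustration; RapidMiner's K-Means operator may initialise differently, so its cluster numbering and results can differ.

```python
def kmeans_1d(values, k, max_iter=100):
    """Cluster a list of numbers into k groups (Lloyd's algorithm, one dimension).

    Hypothetical helper for illustration only; not part of RapidMiner.
    Returns (labels, centroids), where labels[i] is the cluster index of values[i].
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")

    ordered = sorted(values)
    n = len(ordered)
    # Assumption: deterministic start, centroids spread evenly over the sorted values.
    if k == 1:
        centroids = [sum(ordered) / n]
    else:
        centroids = [float(ordered[i * (n - 1) // (k - 1)]) for i in range(k)]

    labels = None
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = sum(members) / len(members)
    return labels, centroids


# The "Count" column from the sample in the question, in the same row order.
counts = [345643, 2000, 2356, 18, 10010, 12, 34252]
labels, centroids = kmeans_1d(counts, k=5)
for count, label in zip(counts, labels):
    print(count, "-> cluster", label)
```

On this small sample the two smallest counts (12 and 18) share a cluster, 2000 and 2356 share another, and each of the three larger counts gets a cluster of its own. Year, Month and Day play no part, which is the effect that "Select Attributes" or "Set Role" has in the RapidMiner process.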