A MEAN task list ... demo application
Get the average value in an array
Statistical routines and probability distributions.
micromark extension to support GFM task list items
mdast extension to parse and serialize GFM task list items
Takes a Feature or FeatureCollection and returns the mean center.
Cast aria-hidden to everything, except...
Calculates the average angle of a set of lines, measuring the trend of it.
A shim for the setImmediate efficient script yielding API
Theia - Task extension. This extension adds support for executing raw or terminal processes in the backend.
Generate TypeScript types and interfaces using JavaScript or TypeScript code
A streaming parser for HTML form data for node.js
A simple tool to keep requests to be executed in order.
A library to convert between EGM96-relative altitudes and WGS84 ellipsoid-relative altitudes
A node API for the dprint TypeScript and JavaScript code formatter
High-priority task queue for Node.js and browsers
Yet another javascript fuzzy matching library
task item extension for tiptap
task list extension for tiptap
Chi-squared distribution expected value.
Takes a FeatureCollection of points and calculates the median center.
High-priority task queue for Node.js and browsers
Build out a lean, mean Modernizr machine.
Fuzzy match a command from a list (typo-safety)
BeTaskable is a small framework for creating and maintaining tasks / chores / assignments. Meaning something that someone has to do.
If your internet connection fails during a deploy, the deploy fails. And that means your site could stop functioning. Ouch. Deploy reliably and quickly from anywhere with slow or unreliable internet connections. It works by invoking the capistrano command from a remote server using `screen`. To ensure tasks run the same as if invoked locally, cap files are first copied to remote server via `scp`.
Ori is a library for Ruby that provides a robust set of primitives for building concurrent applications. The name comes from the Japanese word 折り "ori" meaning "fold", reflecting how concurrent operations interleave. Ori provides a set of primitives that allow you to build concurrent applications—that is, applications that interleave execution within a single thread—without blocking the entire Ruby interpreter for each task.
RubyNEAT -- Neural Evolution of Augmenting Topologies for Ruby. By way of an enhanced form of Genetic Algorithms -- the NEAT algorithm, populations of neural nets are evolved to handle pre-defined goals. RubyNEAT is the first implementation of the NEAT algorithm for Ruby, and it leverages Ruby's power to implement the NEAT algorithm in a way that would be difficult to do in other languages. The 'activation function' is largely standalone. Basically, activation is achieved by functional programming. Meaning, once your network is evolved, you can extract it as source code you can then utilize without the RubyNEAT engine. RubyNEAT can be used for nearly any Machine Learning task you can dream of, because it's also extensible and modular. See http://rubyneat.com for the details.
== OceanDynamo As one important use case for OceanDynamo is to facilitate the conversion of SQL databases to no-SQL DynamoDB databases, it is important that the syntax and semantics of OceanDynamo are as close as possible to those of ActiveRecord. This includes callbacks, exceptions and method chaining semantics. OceanDynamo follows this pattern closely and is of course based on ActiveModel. The attribute and persistence layer of OceanDynamo is modeled on that of ActiveRecord: there's +save+, +save!+, +create+, +update+, +update!+, +update_attributes+, +find_each+, +destroy_all+, +delete_all+, +read_attribute+, +write_attribute+ and all the other methods you're used to. The design goal is always to implement as much of the ActiveRecord interface as possible, without compromising scalability. This makes the task of switching from SQL to no-SQL much easier. OceanDynamo uses only primary indices to retrieve related table items and collections, which means it will scale without limits. OceanDynamo is fully usable as an ActiveModel and can be used by Rails controllers. Thanks to its structural similarity to ActiveRecord, OceanDynamo works with FactoryBot. See also Ocean, a Rails framework for creating highly scalable SOAs in the cloud, in which ocean-dynamo is used as a central component: http://wiki.oceanframework.net
minitest provides a complete suite of testing facilities supporting TDD, BDD, and benchmarking. "I had a class with Jim Weirich on testing last week and we were allowed to choose our testing frameworks. Kirk Haines and I were paired up and we cracked open the code for a few test frameworks... I MUST say that minitest is *very* readable / understandable compared to the 'other two' options we looked at. Nicely done and thank you for helping us keep our mental sanity." -- Wayne E. Seguin minitest/test is a small and incredibly fast unit testing framework. It provides a rich set of assertions to make your tests clean and readable. minitest/spec is a functionally complete spec engine. It hooks onto minitest/test and seamlessly bridges test assertions over to spec expectations. minitest/benchmark is an awesome way to assert the performance of your algorithms in a repeatable manner. Now you can assert that your newb co-worker doesn't replace your linear algorithm with an exponential one! minitest/pride shows pride in testing and adds coloring to your test output. I guess it is an example of how to write IO pipes too. :P minitest/test is meant to have a clean implementation for language implementors that need a minimal set of methods to bootstrap a working test suite. For example, there is no magic involved for test-case discovery. "Again, I can't praise enough the idea of a testing/specing framework that I can actually read in full in one sitting!" -- Piotr Szotkowski Comparing to rspec: rspec is a testing DSL. minitest is ruby. -- Adam Hawkins, "Bow Before MiniTest" minitest doesn't reinvent anything that ruby already provides, like: classes, modules, inheritance, methods. This means you only have to learn ruby to use minitest and all of your regular OO practices like extract-method refactorings still apply. == Features/Problems: * minitest/autorun - the easy and explicit way to run all your tests. * minitest/test - a very fast, simple, and clean test system. * minitest/spec - a very fast, simple, and clean spec system. * minitest/benchmark - an awesome way to assert your algorithm's performance. * minitest/pride - show your pride in testing! * minitest/test_task - a full-featured and clean rake task generator. * Incredibly small and fast runner, but no bells and whistles. * Written by squishy human beings. Software can never be perfect. We will all eventually die.
# Rake::ToolkitProgram Create toolkit programs easily with `Rake` and `OptionParser` syntax. Bash completions and usage help are baked in. ## Installation Add this line to your application's Gemfile: ```ruby gem 'rake-toolkit_program' ``` And then execute: $ bundle Or install it yourself as: $ gem install rake-toolkit_program ## Quickstart * Shebang it up (in a file named `awesome_tool.rb`) ```ruby #!/usr/bin/env ruby ``` * Require the library ```ruby require 'rake/toolkit_program' ``` * Make your life easier ```ruby Program = Rake::ToolkitProgram ``` * Define your command tasks ```ruby Program.command_tasks do desc "Build it" task 'build' do # Ruby code here end desc "Test it" task 'test' => ['build'] do # Rake syntax ↑↑↑↑↑↑↑ for dependencies # Ruby code here end end ``` You can use `Program.args` in your tasks to access the other arguments on the command line. For argument parsing integrated into the help provided by the program, see the use of `Rake::Task(Rake::ToolkitProgram::TaskExt)#parse_args` below. * Wire the mainline ```ruby Program.run(on_error: :exit_program!) if $0 == __FILE__ ``` * In the shell, prepare to run the program (UNIX/Linux systems only) ```console $ chmod +x awesome_tool.rb $ ./awesome_tool.rb --install-completions Completions installed in /home/rtweeks/.bashrc Source /home/rtweeks/.bash-complete/awesome_tool.rb-completions for immediate availability. $ source /home/rtweeks/.bash-complete/awesome_tool.rb-completions ``` * Ask for help ```console $ ./awesome_tool.rb help *** ./awesome_tool.rb Toolkit Program *** . . . ``` ## Usage Let's look at a short sample toolkit program -- put this in `awesome.rb`: ```ruby #!/usr/bin/env ruby require 'rake/toolkit_program' require 'ostruct' ToolkitProgram = Rake::ToolkitProgram ToolkitProgram.title = "My Awesome Toolkit of Awesome" ToolkitProgram.command_tasks do desc <<-END_DESC.dedent Fooing myself I'm not sure what I'm doing, but I'm definitely fooing! END_DESC task :foo do a = ToolkitProgram.args puts "I'm fooed#{' on a ' if a.implement}#{a.implement}" end.parse_args(into: OpenStruct.new) do |parser, args| parser.no_positional_args! parser.on('-i', '--implement IMPLEMENT', 'An implement on which to be fooed') do |val| args.implement = val end end end if __FILE__ == $0 ToolkitProgram.run(on_error: :exit_program!) end ``` Make sure to `chmod +x awesome.rb`! What does this support? $ ./awesome.rb foo I'm fooed $ ./awesome.rb --help *** My Awesome Toolkit of Awesome *** Usage: ./awesome.rb COMMAND [OPTION ...] Avaliable options vary depending on the command given. For details of a particular command, use: ./awesome.rb help COMMAND Commands: foo Fooing myself help Show a list of commands or details of one command Use help COMMAND to get more help on a specific command. $ ./awesome.rb help foo *** My Awesome Toolkit of Awesome *** Usage: ./awesome.rb foo [OPTION ...] Fooing myself I'm not sure what I'm doing, but I'm definitely fooing! Options: -i, --implement IMPLEMENT An implement on which to be fooed $ ./awesome.rb --install-completions Completions installed in /home/rtweeks/.bashrc Source /home/rtweeks/.bash-complete/awesome.rb-completions for immediate availability. $ source /home/rtweeks/.bash-complete/awesome.rb-completions $ ./awesome.rb <tab><tab> foo help $ ./awesome.rb f<tab> ↳ ./awesome.rb foo $ ./awesome.rb foo <tab> ↳ ./awesome.rb foo -- $ ./awesome.rb foo --<tab><tab> --help --implement $ ./awesome.rb foo --i<tab> ↳ ./awesome.rb foo --implement $ ./awesome.rb foo --implement <tab><tab> --help awesome.rb $ ./awesome.rb foo --implement spoon I'm fooed on a spoon ### Defining Toolkit Commands Just define tasks in the block of `Rake::ToolkitProgram.command_tasks` with `task` (i.e. `Rake::DSL#task`). If `desc` is used to provide a description, the task will become visible in help and completions. When a command task is initially defined, positional arguments to the command are available as an `Array` through `Rake::ToolkitProgram.args`. ### Option Parsing This gem extends `Rake::Task` with a `#parse_args` method that creates a `Rake::ToolkitProgram::CommandOptionParser` (derived from the standard library's `OptionParser`) and an argument accumulator and `yield`s them to its block. * The arguments accumulated through the `Rake::ToolkitProgram::CommandOptionParser` are available to the task in `Rake::ToolkitProgram.args`, replacing the normal `Array` of positional arguments. * Use the `into:` keyword of `#parse_args` to provide a custom argument accumulator object for the associated command. The default argument accumulator constructor can be defined with `Rake::ToolkitProgram.default_parsed_args`. Without either of these, the default accumulator is a `Hash`. * Options defined using `OptionParser#on` (or any of the variants) will print in the help for the associated command. ### Positional Arguments Accessing positional arguments given after the command name depends on whether or not `Rake::Task(Rake::ToolkitProgram::TaskExt)#parse_args` has been called on the command task. If this method is not called, positional arguments will be an `Array` accessible through `Rake::ToolkitProgram.args`. When `Rake::Task(Rake::ToolkitProgram::TaskExt)#parse_args` is used: * `Rake::ToolkitProgram::CommandOptionParser#capture_positionals` can be used to define how positional arguments are accumulated. * If the argument accumulator is a `Hash`, the default (without calling this method) is to assign the `Array` of positional arguments to the `nil` key of the `Hash`. * For other types of accumulators, the positional arguments are only accessible if `Rake::ToolkitProgram::CommandOptionParser#capture_positionals` is used to define how they are captured. * If a block is given to this method, the block of the method will receive the `Array` of positional arguments. If it is passed an argument value, that value is used as the key under which to store the positional arguments if the argument accumulator is a `Hash`. * `Rake::ToolkitProgram::CommandOptionParser#expect_positional_cardinality` can be used to set a rule for the count of positional arguments. This will affect the _usage_ presented in the help for the associated command. * `Rake::ToolkitProgram::CommandOptionParser#map_positional_args` may be used to transform (or otherwise process) positional arguments one at a time and in the context of options and/or arguments appearing earlier on the command line. ### Convenience Methods * `Rake::Task(Rake::ToolkitProgram::TaskExt)#prohibit_args` is a quick way, for commands that accept no options or positional arguments, to declare this so the help and bash completions reflect this. It is equivalent to using `#parse_args` and telling the parser `parser.expect_positional_cardinality(0)`. * `Rake::ToolkitProgram::CommandOptionParser#no_positional_args!` is a shortcut for calling `#expect_positional_cardinality(0)` on the same object. * `Rake::Task(Rake::ToolkitProgram::TaskExt)#invalid_args!` and `Rake::ToolkitProgram::CommandOptionParser#invalid_args!` are convenient ways to raise `Rake::ToolkitProgram::InvalidCommandLine` with a message. ## OptionParser in Rubies Before and After v2.4 The `OptionParser` class was extended in Ruby 2.4 to simplify capturing options into a `Hash` or other container implementing `#[]=` in a similar way. This gem supports that, but it means that behavior varies somewhat between the pre-2.4 era and the 2.4+ era. To have consistent behavior across that version change, the recommendation is to use a `Struct`, `OpenStruct`, or custom class to hold program options rather than `Hash`. ## Development After checking out the repo, run `bin/setup` to install dependencies. You can also run `bin/console` for an interactive prompt that will allow you to experiment. To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org). To run the tests, use `rake`, `rake test`, or `rspec spec`. Tests can only be run on systems that support `Kernel#fork`, as this is used to present a pristine and isolated environment for setting up the tool. If run using Ruby 2.3 or earlier, some tests will be pending because functionality expects Ruby 2.4's `OptionParser`. ## Contributing Bug reports and pull requests are welcome on GitHub at https://github.com/PayTrace/rake-toolkit_program. For further details on contributing, see [CONTRIBUTING.md](./CONTRIBUTING.md).
QA Robusta is an automation framework easing pain points away from automation test case writers. How is pain relieved? * Elements, such as links, buttons, and other html objects are defined in one location. This ensures over time the user won't have definitions spread out throughout different layers of code requiring time consuming updates if the application under test is modified. * Well defined flows allows the user to have a common means for navigating and controlling interactions with the application under test. This takes all logic out of test classes and yields in higher more modular code re-use. * When an application requiring testing has the elements and flows implemented less code savy resources can easily add new test cases once trained on how to access the flows and elements. * When ever a link or button is clicked a screen shot is taken * Results are available under site/results directory in html format. Report includes the rdoc on a per test class method along with any screen shots taken. Example report: https://cyberconnect.biz/opensource/demo_results.html * Transparent remote Unix command execution leading to well defined interfaces for common task. For example, one may have a class defined specifically for RemoteUnixNetwork. This class would have methods such as, assign_ip, ifup, ifdown, etc. This class then would be able to perform these task on any remote Unix machine. * Executes the same on Windows or Linux/Unix environments. Developers have the freedom to develop on the platform of choice. * Mechanize extension: Allows the user to define a web application's page elements in a YAML format and provide navigation paths accessing the YAML structure to interact with the web application. Users can also perform direct http.post or any other mechanize functionality when defining state-full interfaces to hit a web application without going through a browser.
Authentication / Authorization library for Watermark apps
Inventory Inventory keeps track of the contents of your Ruby¹ projects. Such an inventory can be used to load the project, create gem specifications and gems, run unit tests, compile extensions, and verify that the project’s content is what you think it is. ¹ See http://ruby-lang.org/ § Usage Let’s begin by discussing the project structure that Inventory expects you to use. It’s pretty much exactly the same as the standard Ruby project structure¹: ├── README ├── Rakefile ├── lib │ ├── foo-1.0 │ │ ├── bar.rb │ │ └── version.rb │ └── foo-1.0.rb └── test └── unit ├── foo-1.0 │ ├── bar.rb │ └── version.rb └── foo-1.0.rb Here you see a simplified version of a project called “Foo”’s project structure. The only real difference from the standard is that the main entry point into the library is named “foo-1.0.rb” instead of “foo.rb” and that the root sub-directory of “lib” is similarly named “foo-1.0” instead of “foo”. The difference is the inclusion of the API version. This must be the major version of the project followed by a constant “.0”. The reason for this is that it allows concurrent installations of different major versions of the project and means that the wrong version will never accidentally be loaded with require. There’s a bigger difference in the content of the files. ‹Lib/foo-1.0/version.rb› will contain our inventory instead of a String: require 'inventory-1.0' class Foo Version = Foo.new(1, 4, 0){ authors{ author 'A. U. Thor', 'a.u.thor@example.org' } homepage 'http://example.org/' licenses{ license 'LGPLv3+', 'GNU Lesser General Public License, version 3 or later', 'http://www.gnu.org/licenses/' } def dependencies super + Dependencies.new{ development 'baz', 1, 3, 0 runtime 'goo', 2, 0, 0 optional 'roo-loo', 3, 0, 0, :feature => 'roo-loo' } end def package_libs %w[bar.rb] end } end We’re introducing quite a few concepts at once, and we’ll look into each in greater detail, but we begin by setting the ‹Version› constant to a new instance of an Inventory with major, minor, and patch version atoms 1, 4, and 0. Then we add a couple of dependencies and list the library files that are included in this project. The version numbers shouldn’t come as a surprise. These track the version of the API that we’re shipping using {semantic versioning}². They also allow the Inventory#to_s method to act as if you’d defined Version as ‹'1.4.0'›. Next follows information about the authors of the project, the project’s homepage, and the project’s licenses. Each author has a name and an email address. The homepage is simply a string URL. Licenses have an abbreviation, a name, and a URL where the license text can be found. We then extend the definition of ‹dependencies› by adding another set of dependencies to ‹super›. ‹Super› includes a dependency on the version of the inventory project that’s being used with this project, so you’ll never have to list that yourself. The other three dependencies are all of different kinds: development, runtime, and optional. A development dependency is one that’s required while developing the project, for example, a unit-testing framework, a documentation generator, and so on. Runtime dependencies are requirements of the project to be able to run, both during development and when installed. Finally, optional dependencies are runtime dependencies that may or may not be required during execution. The difference between runtime and optional is that the inventory won’t try to automatically load an optional dependency, instead leaving that up to you to do when and if it becomes necessary. By that logic, runtime dependencies will be automatically loaded, which is a good reason for having dependency information available at runtime. The version numbers of dependencies also use semantic versioning, but note that the patch atom is ignored unless the major atom is 0. You should always only depend on the major and minor atoms. As mentioned, runtime dependencies will be automatically loaded and the feature they try to load is based on the name of the dependency with a “-X.0” tacked on the end, where ‘X’ is the major version of the dependency. Sometimes, this isn’t correct, in which case the :feature option may be given to specify the name of the feature. You may also override other parts of a dependency by passing in a block to the dependency, much like we’re doing for inventories. The rest of an inventory will list the various files included in the project. This project only consists of one additional file to those that an inventory automatically include (Rakefile, README, the main entry point, and the version.rb file that defines the inventory itself), namely the library file ‹bar.rb›. Library files will be loaded automatically when the main entry point file loads the inventory. Library files that shouldn’t be loaded may be listed under a different heading, namely “additional_libs”. Both these sets of files will be used to generate a list of unit test files automatically, so each library file will have a corresponding unit test file in the inventory. We’ll discuss the different headings of an inventory in more detail later on. Now that we’ve written our inventory, let’s set it up so that it’s content gets loaded when our main entry point gets loaded. We add the following piece of code to ‹lib/foo-1.0.rb›: module Foo load File.expand_path('../foo-1.0/version.rb', __FILE__) Version.load end That’s all there’s to it. The inventory can also be used to great effect from a Rakefile using a separate project called Inventory-Rake³. Using it’ll give us tasks for cleaning up our project, compiling extensions, installing dependencies, installing and uninstalling the project itself, and creating and pushing distribution files to distribution points. require 'inventory-rake-1.0' load File.expand_path('../lib/foo-1.0/version.rb', __FILE__) Inventory::Rake::Tasks.define Foo::Version Inventory::Rake::Tasks.unless_installing_dependencies do require 'lookout-rake-3.0' Lookout::Rake::Tasks::Test.new end It’s ‹Inventory::Rake::Tasks.define› that does the heavy lifting. It takes our inventory and sets up the tasks mentioned above. As we want to be able to use our Rakefile to install our dependencies for us, the rest of the Rakefile is inside the conditional #unless_installing_dependencies, which, as the name certainly implies, executes its block unless the task being run is the one that installs our dependencies. This becomes relevant when we set up Travis⁴ integration next. The only conditional set-up we do in our Rakefile is creating our test task via Lookout-Rake⁵, which also uses our inventory to find the unit tests to run when executed. Travis integration is straightforward. Simply put before_script: - gem install inventory-rake -v '~> VERSION' --no-rdoc --no-ri - rake gem:deps:install in the project’s ‹.travis.yml› file, replacing ‹VERSION› with the version of Inventory-Rake that you require. This’ll make sure that Travis installs all development, runtime, and optional dependencies that you’ve listed in your inventory before running any tests. You might also need to put env: - RUBYOPT=rubygems in your ‹.travis.yml› file, depending on how things are set up. ¹ Ruby project structure: http://guides.rubygems.org/make-your-own-gem/ ² Semantic versioning: http://semver.org/ ³ Inventory-Rake: http://disu.se/software/inventory-rake-1.0/ ⁴ Travis: http://travis-ci.org/ ⁵ Lookout-Rake: http://disu.se/software/lookout-rake-3.0/ § API If the guide above doesn’t provide you with all the answers you seek, you may refer to the API¹ for more answers. ¹ See http://disu.se/software/inventory-1.0/api/Inventory/ § Financing Currently, most of my time is spent at my day job and in my rather busy private life. Please motivate me to spend time on this piece of software by donating some of your money to this project. Yeah, I realize that requesting money to develop software is a bit, well, capitalistic of me. But please realize that I live in a capitalistic society and I need money to have other people give me the things that I need to continue living under the rules of said society. So, if you feel that this piece of software has helped you out enough to warrant a reward, please PayPal a donation to now@disu.se¹. Thanks! Your support won’t go unnoticed! ¹ Send a donation: https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=now@disu.se&item_name=Inventory § Reporting Bugs Please report any bugs that you encounter to the {issue tracker}¹. ¹ See https://github.com/now/inventory/issues § Authors Nikolai Weibull wrote the code, the tests, the documentation, and this README. § Licensing Inventory is free software: you may redistribute it and/or modify it under the terms of the {GNU Lesser General Public License, version 3}¹ or later², as published by the {Free Software Foundation}³. ¹ See http://disu.se/licenses/lgpl-3.0/ ² See http://gnu.org/licenses/ ³ See http://fsf.org/
README ====== This is a simple API to evaluate information retrieval results. It allows you to load ranked and unranked query results and calculate various evaluation metrics (precision, recall, MAP, kappa) against a previously loaded gold standard. Start this program from the command line with: retreval -l <gold-standard-file> -q <query-results> -f <format> -o <output-prefix> The options are outlined when you pass no arguments and just call retreval You will find further information in the RDOC documentation and the HOWTO section below. If you want to see an example, use this command: retreval -l example/gold_standard.yml -q example/query_results.yml -f yaml -v INSTALLATION ============ If you have RubyGems, just run gem install retreval You can manually download the sources and build the Gem from there by `cd`ing to the folder where this README is saved and calling gem build retreval.gemspec This will create a gem file called which you just have to install with `gem install <file>` and you're done. HOWTO ===== This API supports the following evaluation tasks: - Loading a Gold Standard that takes a set of documents, queries and corresponding judgements of relevancy (i.e. "Is this document relevant for this query?") - Calculation of the _kappa measure_ for the given gold standard - Loading ranked or unranked query results for a certain query - Calculation of _precision_ and _recall_ for each result - Calculation of the _F-measure_ for weighing precision and recall - Calculation of _mean average precision_ for multiple query results - Calculation of the _11-point precision_ and _average precision_ for ranked query results - Printing of summary tables and results Typically, you will want to use this Gem either standalone or within another application's context. Standalone Usage ================ Call parameters --------------- After installing the Gem (see INSTALLATION), you can always call `retreval` from the commandline. The typical call is: retreval -l <gold-standard-file> -q <query-results> -f <format> -o <output-prefix> Where you have to define the following options: - `gold-standard-file` is a file in a specified format that includes all the judgements - `query-results` is a file in a specified format that includes all the query results in a single file - `format` is the format that the files will use (either "yaml" or "plain") - `output-prefix` is the prefix of output files that will be created Formats ------- Right now, we focus on the formats you can use to load data into the API. Currently, we support YAML files that must adhere to a special syntax. So, in order to load a gold standard, we need a file in the following format: * "query" denotes the query * "documents" these are the documents judged for this query * "id" the ID of the document (e.g. its filename, etc.) * "judgements" an array of judgements, each one with: * "relevant" a boolean value of the judgment (relevant or not) * "user" an optional identifier of the user Example file, with one query, two documents, and one judgement: - query: 12th air force germany 1957 documents: - id: g5701s.ict21311 judgements: [] - id: g5701s.ict21313 judgements: - relevant: false user: 2 So, when calling the program, specify the format as `yaml`. For the query results, a similar format is used. Note that it is necessary to specify whether the result sets are ranked or not, as this will heavily influence the calculations. You can specify the score for a document. By "score" we mean the score that your retrieval algorithm has given the document. But this is not necessary. The documents will always be ranked in the order of their appearance, regardless of their score. Thus in the following example, the document with "07" at the end is the first and "25" is the last, regardless of the score. --- query: 12th air force germany 1957 ranked: true documents: - score: 0.44034874 document: g5701s.ict21307 - score: 0.44034874 document: g5701s.ict21309 - score: 0.44034874 document: g5701s.ict21311 - score: 0.44034874 document: g5701s.ict21313 - score: 0.44034874 document: g5701s.ict21315 - score: 0.44034874 document: g5701s.ict21317 - score: 0.44034874 document: g5701s.ict21319 - score: 0.44034874 document: g5701s.ict21321 - score: 0.44034874 document: g5701s.ict21323 - score: 0.44034874 document: g5701s.ict21325 --- query: 1612 ranked: true documents: - score: 1.0174774 document: g3290.np000144 - score: 0.763108 document: g3201b.ct000726 - score: 0.763108 document: g3400.ct000886 - score: 0.6359234 document: g3201s.ct000130 --- **Note**: You can also use the `plain` format, which will load the gold standard in a different way (but not the results): my_query my_document_1 false my_query my_document_2 true See that every query/document/relevancy pair is separated by a tabulator? You can also add the user's ID in the fourth column if necessary. Running the evaluation ----------------------- After you have specified the input files and the format, you can run the program. If needed, the `-v` switch will turn on verbose messages, such as information on how many judgements, documents and users there are, but this shouldn't be necessary. The program will first load the gold standard and then calculate the statistics for each result set. The output files are automatically created and contain a YAML representation of the results. Calculations may take a while depending on the amount of judgements and documents. If there are a thousand judgements, always consider a few seconds for each result set. Interpreting the output files ------------------------------ Two output files will be created: - `output_avg_precision.yml` - `output_statistics.yml` The first lists the average precision for each query in the query result file. The second file lists all supported statistics for each query in the query results file. For example, for a ranked evaluation, the first two entries of such a query result statistic look like this: --- 12th air force germany 1957: - :precision: 0.0 :recall: 0.0 :false_negatives: 1 :false_positives: 1 :true_negatives: 2516 :true_positives: 0 :document: g5701s.ict21313 :relevant: false - :precision: 0.0 :recall: 0.0 :false_negatives: 1 :false_positives: 2 :true_negatives: 2515 :true_positives: 0 :document: g5701s.ict21317 :relevant: false You can see the precision and recall for that specific point and also the number of documents for the contingency table (true/false positives/negatives). Also, the document identifier is given. API Usage ========= Using this API in another ruby application is probably the more common use case. All you have to do is include the Gem in your Ruby or Ruby on Rails application. For details about available methods, please refer to the API documentation generated by RDoc. **Important**: For this implementation, we use the document ID, the query and the user ID as the primary keys for matching objects. This means that your documents and queries are identified by a string and thus the strings should be sanitized first. Loading the Gold Standard ------------------------- Once you have loaded the Gem, you will probably start by creating a new gold standard. gold_standard = GoldStandard.new Then, you can load judgements into this standard, either from a file, or manually: gold_standard.load_from_yaml_file "my-file.yml" gold_standard.add_judgement :document => doc_id, :query => query_string, :relevant => boolean, :user => John There is a nice shortcut for the `add_judgement` method. Both lines are essentially the same: gold_standard.add_judgement :document => doc_id, :query => query_string, :relevant => boolean, :user => John gold_standard << :document => doc_id, :query => query_string, :relevant => boolean, :user => John Note the usage of typical Rails hashes for better readability (also, this Gem was developed to be used in a Rails webapp). Now that you have loaded the gold standard, you can do things like: gold_standard.contains_judgement? :document => "a document", :query => "the query" gold_standard.relevant? :document => "a document", :query => "the query" Loading the Query Results ------------------------- Now we want to create a new `QueryResultSet`. A query result set can contain more than one result, which is what we normally want. It is important that you specify the gold standard it belongs to. query_result_set = QueryResultSet.new :gold_standard => gold_standard Just like the Gold Standard, you can read a query result set from a file: query_result_set.load_from_yaml_file "my-results-file.yml" Alternatively, you can load the query results one by one. To do this, you have to create the results (either ranked or unranked) and then add documents: my_result = RankedQueryResult.new :query => "the query" my_result.add_document :document => "test_document 1", :score => 13 my_result.add_document :document => "test_document 2", :score => 11 my_result.add_document :document => "test_document 3", :score => 3 This result would be ranked, obviously, and contain three documents. Documents can have a score, but this is optional. You can also create an Array of documents first and add them altogether: documents = Array.new documents << ResultDocument.new :id => "test_document 1", :score => 20 documents << ResultDocument.new :id => "test_document 2", :score => 21 my_result = RankedQueryResult.new :query => "the query", :documents => documents The same applies to `UnrankedQueryResult`s, obviously. The order of ranked documents is the same as the order in which they were added to the result. The `QueryResultSet` will now contain all the results. They are stored in an array called `query_results`, which you can access. So, to iterate over each result, you might want to use the following code: query_result_set.query_results.each_with_index do |result, index| # ... end Or, more simply: for result in query_result_set.query_results # ... end Calculating statistics ---------------------- Now to the interesting part: Calculating statistics. As mentioned before, there is a conceptual difference between ranked and unranked results. Unranked results are much easier to calculate and thus take less CPU time. No matter if unranked or ranked, you can get the most important statistics by just calling the `statistics` method. statistics = my_result.statistics In the simple case of an unranked result, you will receive a hash with the following information: * `precision` - the precision of the results * `recall` - the recall of the results * `false_negatives` - number of not retrieved but relevant items * `false_positives` - number of retrieved but nonrelevant * `true_negatives` - number of not retrieved and nonrelevantv items * `true_positives` - number of retrieved and relevant items In case of a ranked result, you will receive an Array that consists of _n_ such Hashes, depending on the number of documents. Each Hash will give you the information at a certain rank, e.g. the following to lines return the recall at the fourth rank. statistics = my_ranked_result.statistics statistics[3][:recall] In addition to the information mentioned above, you can also get for each rank: * `document` - the ID of the document that was returned at this rank * `relevant` - whether the document was relevant or not Calculating statistics with missing judgements ---------------------------------------------- Sometimes, you don't have judgements for all document/query pairs in the gold standard. If this happens, the results will be cleaned up first. This means that every document in the results that doesn't appear to have a judgement will be removed temporarily. As an example, take the following results: * A * B * C * D Our gold standard only contains judgements for A and C. The results will be cleaned up first, thus leading to: * A * C With this approach, we can still provide meaningful results (for precision and recall). Other statistics ---------------- There are several other statistics that can be calculated, for example the **F measure**. The F measure weighs precision and recall and has one parameter, either "alpha" or "beta". Get the F measure like so: my_result.f_measure :beta => 1 If you don't specify either alpha or beta, we will assume that beta = 1. Another interesting measure is **Cohen's Kappa**, which tells us about the inter-agreement of assessors. Get the kappa statistic like this: gold_standard.kappa This will calculate the average kappa for each pairwise combination of users in the gold standard. For ranked results one might also want to calculate an **11-point precision**. Just call the following: my_ranked_result.eleven_point_precision This will return a Hash that has indices at the 11 recall levels from 0 to 1 (with steps of 0.1) and the corresponding precision at that recall level.
$Id: README.txt 204 2010-11-30 02:20:04Z pwilkins $ sm-transcript reads results of SLS processing and produces transcripts for the SpokenMedia browser. For each file in the source folder whose extension matches the source type, a file of destination type is created in the destination folder. All of these parameters have default values. Note: Examples of the commands you enter in the terminal are for *nix. The command prompt in the examples is: felix$ <command line> If you are a Windows user, make the usual adjustments. Requirements: sm-transcript is written in Ruby and packaged as a RubyGem. Since Ruby is not a compiled language, you will need to have Ruby installed on your machine to run sm-transcript. You can determine if Ruby is installed by typing "ruby -v" at a terminal prompt. It should return the version of Ruby that is installed. If Ruby is not installed on your machine, navigate to http://www.ruby-lang.org/ and follow the installation instructions. sm-transcript was developed using Ruby 1.8. Other Ruby versions have not been tested as of this release. Installation: You can get sm-transcript as either a RubyGem or as source from svn. The preferred way to install this package is as a Rubygem. You can download and install the gem with this command: felix$ sudo gem install [--verbose] sm-transcript This command downloads the most recent version of the gem from rubygems.org and makes it active. Previous versions of the gem remain installed, but are deactivated. You must use "sudo" to properly install the gem. If you execute "gem install" (omitting the "sudo") the gem is installed in your home gem repository and it isn't in your path without additional configuration. Note: You need sudo privileges to run the command as written. If you can't sudo, then you can install it locally and will need some additional configuration. Contact me (or your local Ruby wizard) for assistance. The executable is now in your path. You can cleanly uninstall the gem with this command: felix$ sudo gem uninstall sm-transcript If you have access to our svn repository, you are welcome to check out the code. Be warned that the trunk tip is not necessarily stable. It changes frequently as enhancements (and bug fixes) are added. (note that the 'smb_transcript' in the command line below is not a typo.) svn co svn+ssh://svn.mit.edu/oeit-tsa/SMB/smb_transcript/trunk sm_transcript build the gem by running this command from the directory you installed the source. This is what it looks like on my machine: felix$ rake gem The gem will be built and put in ./pkg You can now use the gem installation instructions above. Using the App: Run with no command line parameters, the app reads *.wrd files out of ./results and writes *.t1.html files to ./transcripts. These directories are relative to where sm_transcript is called. Note: destination files are overwritten without a warning prompt. If you want to preserve an existing output file, rename it before running the app again. For example, run the app by navigating to the bin folder and enter projects/sm_transcript/bin felix$ sm_transcript This command run from this folder will read *.wrd files from bin/results and write *-t1.html to bin/transcripts. Usage: sm_transcript [options] --srcdir PATH Read files from this folder (Default: ./results) --destdir PATH Write files to this folder (Default: ./transcripts) --srctype wrd | seg | txt | ttml | srt Kind of file to process (Default: wrd) --desttype html | ttml | datajs | json Kind of file to output (Default: html) -h, --help Show this message There is a serious gotch'a in specifying the srctype parameter: it must match the case of the file extension that you're processing. This means that if the srt files that you are processing have the extension .SRT, then you must specify the srctype as "SRT". Pretty lame, I know. I will update the gem with a fix shortly. My apologies until then. Troubleshooting: sm-transcript requires additional gems to operate. The RubyGem installation should install dependencies automatically, but when it doesn't, you get an error that includes ... no such file to load -- builder (LoadError) in the first few lines when you run sm-transcript, the problem is a missing dependent gem. (the error above indicates that the Builder gem is missing.) Try installing the missing gem. For the error above, the command looks like this on my computer: felix$ sudo gem install builder See "Required Gems" below for more information. A warning message such as: "WARNING: Nokogiri was built against LibXML version 2.7.6, but has dynamically loaded 2.7.7"" may be safely ignored. If you continue to have trouble, feel free to contact me. Upgrading: You can easily upgrade by simply executing the same command you used to install the gem. Running install again will add the newer version and make it active. By default the most recent version is used, but older versions are still available, simply inactive. If are using svn, you should already know what to do. Required Gems: builder - create structured data, such as XML extensions - added for the 'require_relative' command. (To get this command in Ruby 1.8 you need to install this gem, for Ruby 1.9 the command is already part of the core.) htmlentities - html parsing json - create JSON structured data nokogiri - xml parsing library optparse - option parsing of command line ostruct - open data structures ppcommand - pp is a pretty printer. It is used only for debugging rake - make for Ruby rubygems - support for gems (shouldn't be needed for Ruby 1.9) shoulda - enhancement for Test::Unit This command installs gems on OSX and Linux: felix$ sudo gem install <gem name> I recommend running the following command to update to latest version of rubygems before loading new gems. felix$ sudo gem update --system Unit Tests: You may run all unit tests by navigating to the test folder and running rake with no parameters (the default rake task runs all tests). On my computer, it looks like this: projects/sm_transcript/test felix$ rake Release Notes: Initial Version - runs under Ruby 1.8.x. version 0.0.4 - fixes bug when processing .WRD files with CRLF line endings. version 0.0.5 - removed due to posting error version 0.0.6 - added srctype of ttml and desttype of json, fixed bug where beginning time of word was actually for previous word. version 0.0.7 - added srt as srctype version 0.0.8 - fixed bug that dropped last phrase from transcripts version 1.0.0 - declared this version 1.0.0 to conform more closely with gem numbering conventions. All tests run successfully. To Do: - specify individual files for processing rather than folders - fix bug in srt processing: can't read Creole srt content. - allow user to modify the "t1" file extension for addition languages of the same transcript. - update code to run under Ruby 1.9
No description provided.
No description provided.
No description provided.
No description provided.
No description provided.
No description provided.
No description provided.