Tasks

The apetype tasks module builds an inheritable TaskBase class around ConfigBase. Inheriting from other TaskBase classes, or including them as dependencies, results in an implicit pipeline. For explicit pipelines, where the inputs and outputs of subsequent dependencies need to be fitted to each other, see apetype.pipelines.

The example below illustrates TaskBase dependencies, as well as subtasks that are executed in a subprocess with either a shell or a Python 2 interpreter. A subprocess can be called by accessing the env attribute, specifying the command, and then passing the script as a string, or more elegantly by defining a subtask that only has a docstring. The first line of the docstring must specify the environment, either with one of the short strings ‘sh’, ‘py’ for Python 3, ‘py2’ for Python 2, or ‘R’, or with a full shebang, e.g. ‘#!/usr/bin/env bash’.
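The docstring convention above can be sketched in plain Python. This is a standalone illustration, not apetype's implementation: the helper names `split_script_docstring` and `render`, the interpreter commands in `SHORTHANDS`, and the plain string substitution for `{{name}}` placeholders are all assumptions made for this sketch.

```python
import textwrap

# Assumed interpreter commands for each shorthand; apetype may use others.
SHORTHANDS = {"sh": "sh", "py": "python3", "py2": "python2", "R": "Rscript"}


def split_script_docstring(doc):
    """Hypothetical helper: return (interpreter, script) for a docstring."""
    first_line, _, body = doc.partition("\n")
    first_line = first_line.strip()
    if first_line.startswith("#!"):
        # Full shebang: everything after '#!' is taken as the command.
        interpreter = first_line[2:].strip()
    else:
        interpreter = SHORTHANDS[first_line]
    return interpreter, textwrap.dedent(body).strip()


def render(script, **settings):
    """Hypothetical helper: fill {{name}} placeholders with setting values."""
    for name, value in settings.items():
        script = script.replace("{{" + name + "}}", str(value))
    return script


doc = """py2
    for i in range({{a}}):
        print i
    """
interpreter, script = split_script_docstring(doc)
print(interpreter)
print(render(script, a=10))
```

The sketch only separates the environment line from the script body and fills in the settings. Running the script in a subprocess is left out on purpose.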

In the subtasks of a task, self is replaced with _. This makes it clearer that these methods are usually not called directly by the end user. However, self can be used if desired. Furthermore, when writing tests it can be appropriate to call the subtask methods directly, which allows control over what is injected.
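The naming convention itself is ordinary Python, as the minimal sketch below shows. It uses a plain class (the hypothetical `PlainTask`), not a TaskBase, so it says nothing about how apetype injects parameters. It only shows that `_` can stand in for self and that such a method can be called directly with a chosen input.

```python
# Plain Python class, not an apetype TaskBase: `_` is only a naming
# convention for the first (instance) parameter of a method.
class PlainTask:
    def generate_output(_, a) -> str:
        return a.upper()


# In a test, the method can be called directly with a hand-picked input.
result = PlainTask().generate_output("injected value")
print(result)
```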

Example:

>>> import apetype as at
>>> class TaskDep(at.TaskBase):
...     a: str = '/tmp/file1'
...
...     def generate_output(_, a) -> str:
...         return a
...
>>> class Task(at.TaskBase):
...     # Task settings
...     a: int = 10
...     b: str = 'a'
...
...     # Task dependencies
...     task_dependence1: TaskDep
...
...     # Subtasks
...     def generate_output1(_, task_dependence1) -> int:
...         print(task_dependence1.a)
...         return 0
...
...     # Subtask that runs a shell command through the env attribute
...     def generate_output2(_) -> str:
...         with _.env('sh') as env:
...             env.exec('which python')
...             return env.output
...
...     # Docstring-only subtask, executed with a Python 2 interpreter
...     def generate_output3(_, a) -> str:
...         '''py2
...         for i in range({{a}}):
...             print i
...         '''
...
>>> task = Task()
>>> task.run()
>>> print(task._input, task._output)

class apetype.tasks.InjectCopy

To avoid side effects on injected parameters, this can be used to inject a copy of objects such as a pd.DataFrame instead.
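The side effect that a copy guards against can be sketched without apetype. The function `drop_first` is a hypothetical stand-in for a subtask that mutates its argument, and `copy.deepcopy` is an assumption about how a copy might be made. The library may copy differently, for example through the object's own copy method.

```python
import copy


def drop_first(rows):
    """Stand-in for a subtask that mutates the object it receives."""
    rows.pop(0)
    return rows


original = [1, 2, 3]
# Passing a copy leaves the caller's object untouched.
result = drop_first(copy.deepcopy(original))
print(original, result)
```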

class apetype.tasks.InjectInterface

Any InjectInterface (II) must define a __call__ method that takes the parameter annotated with the II, the subtask name, and a dict of flags. The flags can be modified, but __call__ should return only the transformed parameter.
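A minimal sketch of the call shape described above follows. The class `StripWhitespace` and the flag key `stripped_for` are hypothetical. The class deliberately does not subclass apetype's InjectInterface, and both the argument order and the call on an instance are assumptions read from the description, not confirmed library behaviour.

```python
class StripWhitespace:
    """Hypothetical injector that follows the described __call__ shape."""

    def __call__(self, parameter, subtask_name, flags):
        # Flags may be modified in place ...
        flags["stripped_for"] = subtask_name
        # ... but only the transformed parameter is returned.
        return parameter.strip()


flags = {}
value = StripWhitespace()("  /tmp/file1  ", "generate_output", flags)
print(value, flags)
```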

class apetype.tasks.InjectItems

The passed parameter should be a list, tuple, or generator. The result of calling enumerate on the parameter is returned.
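What a subtask would receive can be sketched with the built-in enumerate. The function `inject_items` is a hypothetical stand-in written for this illustration, not the library's code.

```python
def inject_items(parameter):
    """Hypothetical stand-in: hand the subtask (index, item) pairs."""
    return enumerate(parameter)


def letters():
    # Generators are accepted as well as lists and tuples.
    yield from ("a", "b")


from_list = list(inject_items(["x", "y"]))
from_generator = list(inject_items(letters()))
print(from_list, from_generator)
```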

class apetype.tasks.PrintInject

Mixin class that adds print-diverting logic to a task. Classes that inherit from this class in addition to TaskBase can inject print into their subtasks.

print(*args, **kwargs)

Method that can be used instead of print to capture stdout.

When called without args or kwargs, returns the current buffer and resets it.
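The buffer-and-reset behaviour can be sketched as follows. `BufferedPrint` is a hypothetical standalone class, not the PrintInject mixin, and backing the buffer with `io.StringIO` is an assumption made for this illustration.

```python
import io


class BufferedPrint:
    """Hypothetical stand-in for a print method that captures its output."""

    def __init__(self):
        self._buffer = io.StringIO()

    def print(self, *args, **kwargs):
        if not args and not kwargs:
            # No arguments: return what was captured and start a new buffer.
            captured = self._buffer.getvalue()
            self._buffer = io.StringIO()
            return captured
        # Otherwise divert the built-in print into the buffer.
        kwargs["file"] = self._buffer
        print(*args, **kwargs)


task_output = BufferedPrint()
task_output.print("step", 1)
task_output.print("done")
captured = task_output.print()
```

A second call without arguments returns an empty string, because the first such call reset the buffer.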

class apetype.tasks.ReturnTypeInterface(type)
class apetype.tasks.RunInterface
class apetype.tasks.SKIP(type)
class apetype.tasks.SKIPCACHE(type)
class apetype.tasks.TaskBase(parse=False, prefix=True, run=False)