From e1f60b292ffd61151403327aa19ff7a1871820bd Mon Sep 17 00:00:00 2001 From: Nishanth Menon Date: Wed, 13 Oct 2010 00:13:10 +0200 Subject: PM: Introduce library for device-specific OPPs (v7) SoCs have a standard set of tuples consisting of frequency and voltage pairs that the device will support per voltage domain. These are called Operating Performance Points or OPPs. The actual definitions of OPP varies over silicon versions. For a specific domain, we can have a set of {frequency, voltage} pairs. As the kernel boots and more information is available, a default set of these are activated based on the precise nature of device. Further on operation, based on conditions prevailing in the system (such as temperature), some OPP availability may be temporarily controlled by the SoC frameworks. To implement an OPP, some sort of power management support is necessary hence this library depends on CONFIG_PM. Contributions include: Sanjeev Premi for the initial concept: http://patchwork.kernel.org/patch/50998/ Kevin Hilman for converting original design to device-based. Kevin Hilman and Paul Walmsey for cleaning up many of the function abstractions, improvements and data structure handling. Romit Dasgupta for using enums instead of opp pointers. Thara Gopinath, Eduardo Valentin and Vishwanath BS for fixes and cleanups. Linus Walleij for recommending this layer be made generic for usage in other architectures beyond OMAP and ARM. Mark Brown, Andrew Morton, Rafael J. Wysocki, Paul E. McKenney for valuable improvements. Discussions and comments from: http://marc.info/?l=linux-omap&m=126033945313269&w=2 http://marc.info/?l=linux-omap&m=125482970102327&w=2 http://marc.info/?t=125809247500002&r=1&w=2 http://marc.info/?l=linux-omap&m=126025973426007&w=2 http://marc.info/?t=128152609200064&r=1&w=2 http://marc.info/?t=128468723000002&r=1&w=2 incorporated. v1: http://marc.info/?t=128468723000002&r=1&w=2 Signed-off-by: Nishanth Menon Signed-off-by: Kevin Hilman Signed-off-by: Rafael J. Wysocki --- Documentation/power/00-INDEX | 2 + Documentation/power/opp.txt | 375 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 377 insertions(+) create mode 100644 Documentation/power/opp.txt (limited to 'Documentation') diff --git a/Documentation/power/00-INDEX b/Documentation/power/00-INDEX index fb742c213c9e..45e9d4a91284 100644 --- a/Documentation/power/00-INDEX +++ b/Documentation/power/00-INDEX @@ -14,6 +14,8 @@ interface.txt - Power management user interface in /sys/power notifiers.txt - Registering suspend notifiers in device drivers +opp.txt + - Operating Performance Point library pci.txt - How the PCI Subsystem Does Power Management pm_qos_interface.txt diff --git a/Documentation/power/opp.txt b/Documentation/power/opp.txt new file mode 100644 index 000000000000..44d87ad3cea9 --- /dev/null +++ b/Documentation/power/opp.txt @@ -0,0 +1,375 @@ +*=============* +* OPP Library * +*=============* + +(C) 2009-2010 Nishanth Menon , Texas Instruments Incorporated + +Contents +-------- +1. Introduction +2. Initial OPP List Registration +3. OPP Search Functions +4. OPP Availability Control Functions +5. OPP Data Retrieval Functions +6. Cpufreq Table Generation +7. Data Structures + +1. Introduction +=============== +Complex SoCs of today consists of a multiple sub-modules working in conjunction. +In an operational system executing varied use cases, not all modules in the SoC +need to function at their highest performing frequency all the time. To +facilitate this, sub-modules in a SoC are grouped into domains, allowing some +domains to run at lower voltage and frequency while other domains are loaded +more. The set of discrete tuples consisting of frequency and voltage pairs that +the device will support per domain are called Operating Performance Points or +OPPs. + +OPP library provides a set of helper functions to organize and query the OPP +information. The library is located in drivers/base/power/opp.c and the header +is located in include/linux/opp.h. OPP library can be enabled by enabling +CONFIG_PM_OPP from power management menuconfig menu. OPP library depends on +CONFIG_PM as certain SoCs such as Texas Instrument's OMAP framework allows to +optionally boot at a certain OPP without needing cpufreq. + +Typical usage of the OPP library is as follows: +(users) -> registers a set of default OPPs -> (library) +SoC framework -> modifies on required cases certain OPPs -> OPP layer + -> queries to search/retrieve information -> + +OPP layer expects each domain to be represented by a unique device pointer. SoC +framework registers a set of initial OPPs per device with the OPP layer. This +list is expected to be an optimally small number typically around 5 per device. +This initial list contains a set of OPPs that the framework expects to be safely +enabled by default in the system. + +Note on OPP Availability: +------------------------ +As the system proceeds to operate, SoC framework may choose to make certain +OPPs available or not available on each device based on various external +factors. Example usage: Thermal management or other exceptional situations where +SoC framework might choose to disable a higher frequency OPP to safely continue +operations until that OPP could be re-enabled if possible. + +OPP library facilitates this concept in it's implementation. The following +operational functions operate only on available opps: +opp_find_freq_{ceil, floor}, opp_get_voltage, opp_get_freq, opp_get_opp_count +and opp_init_cpufreq_table + +opp_find_freq_exact is meant to be used to find the opp pointer which can then +be used for opp_enable/disable functions to make an opp available as required. + +WARNING: Users of OPP library should refresh their availability count using +get_opp_count if opp_enable/disable functions are invoked for a device, the +exact mechanism to trigger these or the notification mechanism to other +dependent subsystems such as cpufreq are left to the discretion of the SoC +specific framework which uses the OPP library. Similar care needs to be taken +care to refresh the cpufreq table in cases of these operations. + +WARNING on OPP List locking mechanism: +------------------------------------------------- +OPP library uses RCU for exclusivity. RCU allows the query functions to operate +in multiple contexts and this synchronization mechanism is optimal for a read +intensive operations on data structure as the OPP library caters to. + +To ensure that the data retrieved are sane, the users such as SoC framework +should ensure that the section of code operating on OPP queries are locked +using RCU read locks. The opp_find_freq_{exact,ceil,floor}, +opp_get_{voltage, freq, opp_count} fall into this category. + +opp_{add,enable,disable} are updaters which use mutex and implement it's own +RCU locking mechanisms. opp_init_cpufreq_table acts as an updater and uses +mutex to implment RCU updater strategy. These functions should *NOT* be called +under RCU locks and other contexts that prevent blocking functions in RCU or +mutex operations from working. + +2. Initial OPP List Registration +================================ +The SoC implementation calls opp_add function iteratively to add OPPs per +device. It is expected that the SoC framework will register the OPP entries +optimally- typical numbers range to be less than 5. The list generated by +registering the OPPs is maintained by OPP library throughout the device +operation. The SoC framework can subsequently control the availability of the +OPPs dynamically using the opp_enable / disable functions. + +opp_add - Add a new OPP for a specific domain represented by the device pointer. + The OPP is defined using the frequency and voltage. Once added, the OPP + is assumed to be available and control of it's availability can be done + with the opp_enable/disable functions. OPP library internally stores + and manages this information in the opp struct. This function may be + used by SoC framework to define a optimal list as per the demands of + SoC usage environment. + + WARNING: Do not use this function in interrupt context. + + Example: + soc_pm_init() + { + /* Do things */ + r = opp_add(mpu_dev, 1000000, 900000); + if (!r) { + pr_err("%s: unable to register mpu opp(%d)\n", r); + goto no_cpufreq; + } + /* Do cpufreq things */ + no_cpufreq: + /* Do remaining things */ + } + +3. OPP Search Functions +======================= +High level framework such as cpufreq operates on frequencies. To map the +frequency back to the corresponding OPP, OPP library provides handy functions +to search the OPP list that OPP library internally manages. These search +functions return the matching pointer representing the opp if a match is +found, else returns error. These errors are expected to be handled by standard +error checks such as IS_ERR() and appropriate actions taken by the caller. + +opp_find_freq_exact - Search for an OPP based on an *exact* frequency and + availability. This function is especially useful to enable an OPP which + is not available by default. + Example: In a case when SoC framework detects a situation where a + higher frequency could be made available, it can use this function to + find the OPP prior to call the opp_enable to actually make it available. + rcu_read_lock(); + opp = opp_find_freq_exact(dev, 1000000000, false); + rcu_read_unlock(); + /* dont operate on the pointer.. just do a sanity check.. */ + if (IS_ERR(opp)) { + pr_err("frequency not disabled!\n"); + /* trigger appropriate actions.. */ + } else { + opp_enable(dev,1000000000); + } + + NOTE: This is the only search function that operates on OPPs which are + not available. + +opp_find_freq_floor - Search for an available OPP which is *at most* the + provided frequency. This function is useful while searching for a lesser + match OR operating on OPP information in the order of decreasing + frequency. + Example: To find the highest opp for a device: + freq = ULONG_MAX; + rcu_read_lock(); + opp_find_freq_floor(dev, &freq); + rcu_read_unlock(); + +opp_find_freq_ceil - Search for an available OPP which is *at least* the + provided frequency. This function is useful while searching for a + higher match OR operating on OPP information in the order of increasing + frequency. + Example 1: To find the lowest opp for a device: + freq = 0; + rcu_read_lock(); + opp_find_freq_ceil(dev, &freq); + rcu_read_unlock(); + Example 2: A simplified implementation of a SoC cpufreq_driver->target: + soc_cpufreq_target(..) + { + /* Do stuff like policy checks etc. */ + /* Find the best frequency match for the req */ + rcu_read_lock(); + opp = opp_find_freq_ceil(dev, &freq); + rcu_read_unlock(); + if (!IS_ERR(opp)) + soc_switch_to_freq_voltage(freq); + else + /* do something when we cant satisfy the req */ + /* do other stuff */ + } + +4. OPP Availability Control Functions +===================================== +A default OPP list registered with the OPP library may not cater to all possible +situation. The OPP library provides a set of functions to modify the +availability of a OPP within the OPP list. This allows SoC frameworks to have +fine grained dynamic control of which sets of OPPs are operationally available. +These functions are intended to *temporarily* remove an OPP in conditions such +as thermal considerations (e.g. don't use OPPx until the temperature drops). + +WARNING: Do not use these functions in interrupt context. + +opp_enable - Make a OPP available for operation. + Example: Lets say that 1GHz OPP is to be made available only if the + SoC temperature is lower than a certain threshold. The SoC framework + implementation might choose to do something as follows: + if (cur_temp < temp_low_thresh) { + /* Enable 1GHz if it was disabled */ + rcu_read_lock(); + opp = opp_find_freq_exact(dev, 1000000000, false); + rcu_read_unlock(); + /* just error check */ + if (!IS_ERR(opp)) + ret = opp_enable(dev, 1000000000); + else + goto try_something_else; + } + +opp_disable - Make an OPP to be not available for operation + Example: Lets say that 1GHz OPP is to be disabled if the temperature + exceeds a threshold value. The SoC framework implementation might + choose to do something as follows: + if (cur_temp > temp_high_thresh) { + /* Disable 1GHz if it was enabled */ + rcu_read_lock(); + opp = opp_find_freq_exact(dev, 1000000000, true); + rcu_read_unlock(); + /* just error check */ + if (!IS_ERR(opp)) + ret = opp_disable(dev, 1000000000); + else + goto try_something_else; + } + +5. OPP Data Retrieval Functions +=============================== +Since OPP library abstracts away the OPP information, a set of functions to pull +information from the OPP structure is necessary. Once an OPP pointer is +retrieved using the search functions, the following functions can be used by SoC +framework to retrieve the information represented inside the OPP layer. + +opp_get_voltage - Retrieve the voltage represented by the opp pointer. + Example: At a cpufreq transition to a different frequency, SoC + framework requires to set the voltage represented by the OPP using + the regulator framework to the Power Management chip providing the + voltage. + soc_switch_to_freq_voltage(freq) + { + /* do things */ + rcu_read_lock(); + opp = opp_find_freq_ceil(dev, &freq); + v = opp_get_voltage(opp); + rcu_read_unlock(); + if (v) + regulator_set_voltage(.., v); + /* do other things */ + } + +opp_get_freq - Retrieve the freq represented by the opp pointer. + Example: Lets say the SoC framework uses a couple of helper functions + we could pass opp pointers instead of doing additional parameters to + handle quiet a bit of data parameters. + soc_cpufreq_target(..) + { + /* do things.. */ + max_freq = ULONG_MAX; + rcu_read_lock(); + max_opp = opp_find_freq_floor(dev,&max_freq); + requested_opp = opp_find_freq_ceil(dev,&freq); + if (!IS_ERR(max_opp) && !IS_ERR(requested_opp)) + r = soc_test_validity(max_opp, requested_opp); + rcu_read_unlock(); + /* do other things */ + } + soc_test_validity(..) + { + if(opp_get_voltage(max_opp) < opp_get_voltage(requested_opp)) + return -EINVAL; + if(opp_get_freq(max_opp) < opp_get_freq(requested_opp)) + return -EINVAL; + /* do things.. */ + } + +opp_get_opp_count - Retrieve the number of available opps for a device + Example: Lets say a co-processor in the SoC needs to know the available + frequencies in a table, the main processor can notify as following: + soc_notify_coproc_available_frequencies() + { + /* Do things */ + rcu_read_lock(); + num_available = opp_get_opp_count(dev); + speeds = kzalloc(sizeof(u32) * num_available, GFP_KERNEL); + /* populate the table in increasing order */ + freq = 0; + while (!IS_ERR(opp = opp_find_freq_ceil(dev, &freq))) { + speeds[i] = freq; + freq++; + i++; + } + rcu_read_unlock(); + + soc_notify_coproc(AVAILABLE_FREQs, speeds, num_available); + /* Do other things */ + } + +6. Cpufreq Table Generation +=========================== +opp_init_cpufreq_table - cpufreq framework typically is initialized with + cpufreq_frequency_table_cpuinfo which is provided with the list of + frequencies that are available for operation. This function provides + a ready to use conversion routine to translate the OPP layer's internal + information about the available frequencies into a format readily + providable to cpufreq. + + WARNING: Do not use this function in interrupt context. + + Example: + soc_pm_init() + { + /* Do things */ + r = opp_init_cpufreq_table(dev, &freq_table); + if (!r) + cpufreq_frequency_table_cpuinfo(policy, freq_table); + /* Do other things */ + } + + NOTE: This function is available only if CONFIG_CPU_FREQ is enabled in + addition to CONFIG_PM as power management feature is required to + dynamically scale voltage and frequency in a system. + +7. Data Structures +================== +Typically an SoC contains multiple voltage domains which are variable. Each +domain is represented by a device pointer. The relationship to OPP can be +represented as follows: +SoC + |- device 1 + | |- opp 1 (availability, freq, voltage) + | |- opp 2 .. + ... ... + | `- opp n .. + |- device 2 + ... + `- device m + +OPP library maintains a internal list that the SoC framework populates and +accessed by various functions as described above. However, the structures +representing the actual OPPs and domains are internal to the OPP library itself +to allow for suitable abstraction reusable across systems. + +struct opp - The internal data structure of OPP library which is used to + represent an OPP. In addition to the freq, voltage, availability + information, it also contains internal book keeping information required + for the OPP library to operate on. Pointer to this structure is + provided back to the users such as SoC framework to be used as a + identifier for OPP in the interactions with OPP layer. + + WARNING: The struct opp pointer should not be parsed or modified by the + users. The defaults of for an instance is populated by opp_add, but the + availability of the OPP can be modified by opp_enable/disable functions. + +struct device - This is used to identify a domain to the OPP layer. The + nature of the device and it's implementation is left to the user of + OPP library such as the SoC framework. + +Overall, in a simplistic view, the data structure operations is represented as +following: + +Initialization / modification: + +-----+ /- opp_enable +opp_add --> | opp | <------- + | +-----+ \- opp_disable + \-------> domain_info(device) + +Search functions: + /-- opp_find_freq_ceil ---\ +-----+ +domain_info<---- opp_find_freq_exact -----> | opp | + \-- opp_find_freq_floor ---/ +-----+ + +Retrieval functions: ++-----+ /- opp_get_voltage +| opp | <--- ++-----+ \- opp_get_freq + +domain_info <- opp_get_opp_count -- cgit v1.2.1